背景
一个很简单的需求,需要对controller层返回的异常报文,做简单的包装,比如将 异常、error、exception等关键字去掉(由于历史代码 各种异常提示不友好,而c端直接把提示弹出),替换成更友好的提示。
方案一 在Filter里面把response拿到,读取内容,然后替换。
方案二 使用ResponseBodyAdvice,对返回的报文做处理
最终使用了方案二,更简单。一开始用方案一的时候,就遇到http请求超时,tcp未正常四次挥手断连。
问题复现
代码
采用方案一的代码如下
@Slf4j
@Component
@Order(-1)
@WebFilter(urlPatterns = "/*", filterName = "filter")
public class ApiLogFilter extends OncePerRequestFilter {
@Override
protected void doFilterInternal(HttpServletRequest request, HttpServletResponse response, FilterChain filterChain)
throws ServletException, IOException {
CachedHttpServletResponse wrapper = new CachedHttpServletResponse(response);
filterChain.doFilter(request, wrapper);
String content = "hello world" ;
response.getOutputStream().write(content.getBytes());
response.getOutputStream().flush();
}
class CachedHttpServletResponse extends HttpServletResponseWrapper {
private boolean openWriter = false;
private boolean openStream = false;
ServletOutputStream outputStream = null;
private ByteArrayOutputStream output = new ByteArrayOutputStream();
public CachedHttpServletResponse(HttpServletResponse response) {
super(response);
}
public PrintWriter getWriter() throws IOException {
if (openStream) {
throw new IllegalStateException("Cannot re-open writer!");
}
openWriter = true;
return new PrintWriter(output, false);
}
public ServletOutputStream getOutputStream() throws IOException {
if (openWriter) {
throw new IllegalStateException("Cannot re-open writer!");
}
openStream = true;
if (outputStream == null) {
outputStream = new ServletOutputStream() {
public boolean isReady() {
return true;
}
public void setWriteListener(WriteListener listener) {
}
public void write(int b) throws IOException {
output.write(b);
}
};
}
return outputStream;
}
public byte[] getContent() {
return output.toByteArray();
}
}
}
post调用,会发现请求无响应,一直到超时,返回了报文。
场景分析
超时场景
wireshark分析异常下的tcp报文 过滤条件: (ip.src == 192.168.42.59 && ip.dst == 192.168.19.15 && tcp.srcport == 20225) || (ip.dst == 192.168.42.59 && ip.src == 192.168.19.15 && tcp.dstport == 20225) 其中192.168.42.59是服务端, 192.168.19.15是客户端(postman)
从抓包来看,正常的三次握手,然后传输数据报文,并且可以看到服务端有向客户端返回响应报文,但之后就卡主了。等待一会超时,再看下tcp 服务端先发起了挥手动作,tcp才开始断连。
正常场景
上是异常下的tcp报文,下面看下正常的一次请求,代码层面,就是把ApiLogFilter 去掉。
正常的一次http请求,最终由客户端发起挥手,并断连tcp。
问题分析
结论
问题就是客户端在收到报文时,没有主动关闭tcp,发送fin报文,导致tcp连接一直在,然后tcp服务端本身超时关闭,所以是服务端主动发起关闭。
为啥客户端没有主动发送fin报文,这个涉及到http协议,http协议里面的Content-Length,当客户端收到content-length大于实际接受的字节,会认为还是数据要接收,然后就没有关闭tcp,等待接受剩下的数据,但是服务端又没有数据可传输,最终服务端tcp超时关闭连接。也就是说,http协议里面的content-length需要和实际的报文大小一样,否则会让客户端误判。
如果content-length大于实际报文大小,会导致客户端认为还有数据要传输,就不会主动发送fin报文,关闭连接。 如果content-length小于实际报文大小,客户端会截取报文,数据丢失。
过程
Tomcat源码
先跟踪下Tomcat一个请求路径,找出正常场景和异常场景链路的差异性。
Tomcat正常请求路径:
tomcat 启动后Acceptor就专门监听连接,只并封装成event,传给poller去干活
org.apache.tomcat.util.net.Acceptor#run
===>
org.apache.tomcat.util.net.NioEndpoint.Poller#run
org.apache.tomcat.util.net.NioEndpoint.Poller#processKey
org.apache.tomcat.util.net.AbstractEndpoint#processSocket
===>
org.apache.tomcat.util.net.SocketProcessorBase#run
org.apache.tomcat.util.net.SocketProcessorBase#doRun
==>
org.apache.coyote.http11.Http11Processor#service
org.apache.coyote.http11.Http11InputBuffer#parseRequestLine
org.apache.tomcat.util.net.NioEndpoint.NioSocketWrapper#fillReadBuffer(boolean, java.nio.ByteBuffer)
==>
接下来去走执行逻辑
spring处理,controller service等
==>
org.apache.coyote.http11.Http11Processor#service里面while循环
org.apache.coyote.http11.Http11InputBuffer#parseRequestLine
org.apache.tomcat.util.net.NioEndpoint.NioSocketWrapper#fillReadBuffer(boolean, java.nio.ByteBuffer)
这个时候再去socketChannel里面读取请求报文数据,返回eofException,结束流程
==>
org.apache.tomcat.util.net.NioEndpoint.SocketProcessor#doRun
里面处理完逻辑返回的SocketState 是Closed
==>
org.apache.tomcat.util.net.NioEndpoint#close
org.apache.tomcat.util.net.NioEndpoint.Poller#cancelledKey
(ka.getSocket().close(true);)最终服务端也关闭socket(发送fin)
客户端一收到报文,就正常发送fin了,所以客户端会先发,服务的进入close_wait,这里是服务的发送fin
服务端关闭tcp,发送fin org.apache.tomcat.util.net.NioEndpoint.Poller#cancelledKey
public NioSocketWrapper cancelledKey(SelectionKey key) {
NioSocketWrapper ka = null;
try {
if ( key == null ) return null;
ka = (NioSocketWrapper) key.attach(null);
if (ka != null) {
getHandler().release(ka);
}
if (key.isValid()) key.cancel();
if (ka != null) {
try {
ka.getSocket().close(true);
} catch (Exception e){
if (log.isDebugEnabled()) {
log.debug(sm.getString(
"endpoint.debug.socketCloseFail"), e);
}
}
}
if (key.channel().isOpen()) {
try {
key.channel().close();
} catch (Exception e) {
if (log.isDebugEnabled()) {
log.debug(sm.getString(
"endpoint.debug.channelCloseFail"), e);
}
}
}
try {
if (ka != null && ka.getSendfileData() != null
&& ka.getSendfileData().fchannel != null
&& ka.getSendfileData().fchannel.isOpen()) {
ka.getSendfileData().fchannel.close();
}
} catch (Exception ignore) {
}
if (ka != null) {
countDownConnection();
ka.closed = true;
}
} catch (Throwable e) {
ExceptionUtils.handleThrowable(e);
if (log.isDebugEnabled()) log.error("",e);
}
return ka;
}
异常Tomcat请求路径:
上面和正常一样
==>
接下来去走执行逻辑
spring处理,controller service等,然后spring会给response的content-length设置长度,但是在后面的filter里面,却把流又重写了,导致content-length与实际流大小不一致(这就是问题,必须要重新设置content-length)
==>
org.apache.coyote.http11.Http11Processor#service里面while循环
org.apache.coyote.http11.Http11InputBuffer#parseRequestLine
org.apache.tomcat.util.net.NioEndpoint.NioSocketWrapper#fillReadBuffer(boolean, java.nio.ByteBuffer)
这个时候再去socketChannel里面读取请求报文数据,因为客户端还在等待接受数据,tcp未断开,这个时候SocketChannel.read返回0
==>
org.apache.tomcat.util.net.NioEndpoint.SocketProcessor#doRun
里面处理完逻辑后SocketState 是Open,放入processorCache等待超时
==>
poller线程拉取数据处理,刚刚放入processorCache的socket,发现超时,服务端走关闭逻辑
org.apache.tomcat.util.net.NioEndpoint.SocketProcessor#doRun
protected void doRun() {
NioChannel socket = socketWrapper.getSocket();
SelectionKey key = socket.getIOChannel().keyFor(socket.getPoller().getSelector());
try {
int handshake = -1;
try {
if (key != null) {
if (socket.isHandshakeComplete()) {
handshake = 0;
} else if (event == SocketEvent.STOP || event == SocketEvent.DISCONNECT ||
event == SocketEvent.ERROR) {
handshake = -1;
} else {
handshake = socket.handshake(key.isReadable(), key.isWritable());
event = SocketEvent.OPEN_READ;
}
}
} catch (IOException x) {
handshake = -1;
if (log.isDebugEnabled()) log.debug("Error during SSL handshake",x);
} catch (CancelledKeyException ckx) {
handshake = -1;
}
if (handshake == 0) {
SocketState state = SocketState.OPEN;
if (event == null) {
state = getHandler().process(socketWrapper, SocketEvent.OPEN_READ);
} else {
state = getHandler().process(socketWrapper, event);
}
if (state == SocketState.CLOSED) {
close(socket, key);
}
} else if (handshake == -1 ) {
close(socket, key);
} else if (handshake == SelectionKey.OP_READ){
socketWrapper.registerReadInterest();
} else if (handshake == SelectionKey.OP_WRITE){
socketWrapper.registerWriteInterest();
}
} catch (CancelledKeyException cx) {
socket.getPoller().cancelledKey(key);
} catch (VirtualMachineError vme) {
ExceptionUtils.handleThrowable(vme);
} catch (Throwable t) {
log.error("", t);
socket.getPoller().cancelledKey(key);
} finally {
socketWrapper = null;
event = null;
if (running && !paused) {
processorCache.push(this);
}
}
}
服务端超时关闭tcp org.apache.tomcat.util.net.NioEndpoint.Poller#timeout
protected void timeout(int keyCount, boolean hasEvents) {
long now = System.currentTimeMillis();
if (nextExpiration > 0 && (keyCount > 0 || hasEvents) && (now < nextExpiration) && !close) {
return;
}
int keycount = 0;
try {
for (SelectionKey key : selector.keys()) {
keycount++;
try {
NioSocketWrapper ka = (NioSocketWrapper) key.attachment();
if ( ka == null ) {
cancelledKey(key);
} else if (close) {
key.interestOps(0);
ka.interestOps(0);
processKey(key,ka);
} else if ((ka.interestOps()&SelectionKey.OP_READ) == SelectionKey.OP_READ ||
(ka.interestOps()&SelectionKey.OP_WRITE) == SelectionKey.OP_WRITE) {
boolean isTimedOut = false;
if ((ka.interestOps() & SelectionKey.OP_READ) == SelectionKey.OP_READ) {
long delta = now - ka.getLastRead();
long timeout = ka.getReadTimeout();
isTimedOut = timeout > 0 && delta > timeout;
}
if (!isTimedOut && (ka.interestOps() & SelectionKey.OP_WRITE) == SelectionKey.OP_WRITE) {
long delta = now - ka.getLastWrite();
long timeout = ka.getWriteTimeout();
isTimedOut = timeout > 0 && delta > timeout;
}
if (isTimedOut) {
key.interestOps(0);
ka.interestOps(0);
ka.setError(new SocketTimeoutException());
if (!processSocket(ka, SocketEvent.ERROR, true)) {
cancelledKey(key);
}
}
}
}catch ( CancelledKeyException ckx ) {
cancelledKey(key);
}
}
} catch (ConcurrentModificationException cme) {
log.warn(sm.getString("endpoint.nio.timeoutCme"), cme);
}
long prevExp = nextExpiration;
nextExpiration = System.currentTimeMillis() +
socketProperties.getTimeoutInterval();
if (log.isTraceEnabled()) {
log.trace("timeout completed: keys processed=" + keycount +
"; now=" + now + "; nextExpiration=" + prevExp +
"; keyCount=" + keyCount + "; hasEvents=" + hasEvents +
"; eval=" + ((now < prevExp) && (keyCount>0 || hasEvents) && (!close) ));
}
}
}
org.apache.tomcat.util.net.NioEndpoint.NioSocketWrapper#fillReadBuffer
private int fillReadBuffer(boolean block, ByteBuffer to) throws IOException {
int nRead;
NioChannel channel = getSocket();
if (block) {
Selector selector = null;
try {
selector = pool.get();
} catch (IOException x) {
}
try {
NioEndpoint.NioSocketWrapper att = (NioEndpoint.NioSocketWrapper) channel
.getAttachment();
if (att == null) {
throw new IOException("Key must be cancelled.");
}
nRead = pool.read(to, channel, selector, att.getReadTimeout());
} finally {
if (selector != null) {
pool.put(selector);
}
}
} else {
nRead = channel.read(to);
if (nRead == -1) {
throw new EOFException();
}
}
return nRead;
}
参考
Tomcat 中的 NIO 源码分析
用了这么久HTTP, 你是否了解Content-Length和Transfer-Encoding ?
NIO中SocketChannel read()返回0的原因
|