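This post is just a look back at which key business information you need when using DingTalk for exception alerts.
Why send alerts to DingTalk?
It starts with my joining my previous company. After joining, I cloned the production projects and skimmed them: the code style was badly dated and inconsistent. In my very first week there was a production incident. I went to check the production logs and, well, there was no log output at all. We only learned about the incident because a customer reported it. I was shocked, but everyone else seemed calm, as if this were routine.
I thought back to my interview conversation with the director: my main job was to lead the team in adopting a microservice architecture. Clearly, drastic changes were needed.
The first things that came to mind were improving the log output and defining global exception levels, so that logs are written according to the exception level.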
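```mermaid
sequenceDiagram
    participant App as Java application
    participant Aop as Spring AOP global exception interceptor
    participant Ding as DingTalk
    participant Dev as Developers
    App ->> Aop : runtimeException
    Aop ->> Ding : alarmNoticeSend
    Ding ->> Dev : exceptionInfo
```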
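To make "global exception levels" a bit more concrete, here is a minimal sketch of how a level could decide between logging only and also notifying DingTalk. The names `AlarmLevel` and `AlarmPolicy`, the three levels, and the mapping from exception type to level are all hypothetical and chosen for illustration; they are not the original project's code.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

/** Hypothetical severity levels; names and rules are illustrative only. */
enum AlarmLevel {
    INFO(false),   // expected business errors: log only
    WARN(false),   // suspicious but recoverable: log only
    ERROR(true);   // unexpected failures: log and notify DingTalk

    private final boolean notify;

    AlarmLevel(boolean notify) {
        this.notify = notify;
    }

    boolean shouldNotify() {
        return notify;
    }
}

final class AlarmPolicy {
    private static final Logger LOG = Logger.getLogger(AlarmPolicy.class.getName());

    private AlarmPolicy() {
    }

    /** Assumed mapping from exception type to level. */
    static AlarmLevel classify(Throwable t) {
        if (t instanceof IllegalArgumentException) {
            return AlarmLevel.INFO;  // bad input from the caller
        }
        if (t instanceof IllegalStateException) {
            return AlarmLevel.WARN;
        }
        return AlarmLevel.ERROR;     // everything else counts as a real fault
    }

    /** Logs the exception at its level and reports whether an alert should be sent. */
    static boolean handle(Throwable t) {
        AlarmLevel level = classify(t);
        switch (level) {
            case INFO:
                LOG.log(Level.INFO, t.toString());
                break;
            case WARN:
                LOG.log(Level.WARNING, t.toString(), t);
                break;
            default:
                LOG.log(Level.SEVERE, t.toString(), t);
        }
        return level.shouldNotify();
    }
}
```

A global interceptor could call `AlarmPolicy.handle(e)` and only push to DingTalk when it returns `true`, which keeps expected business errors out of the chat group.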
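What exception information do we need to see in DingTalk? I wrote a global interceptor and added AOP-based global interception to the shared base project. Right after it went live, DingTalk would easily receive several thousand exception alerts a day. At first everyone was nervous; after a month or so, everyone had gone numb again. This is what we sent to DingTalk at the time: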
```text
Alarm
Project: z201-gateway
Class: cn.z201.cloud.gateway.VlinkFrameworkGatewayApplicationTest
Method: alarm
Exception: java.lang.IllegalAccessException
Stack trace:
cn.z201.cloud.gateway.VlinkFrameworkGatewayApplicationTest.alarm(VlinkFrameworkGatewayApplicationTest.java:64)
sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
java.lang.reflect.Method.invoke(Method.java:498)
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
org.springframework.test.context.junit4.statements.RunBeforeTestExecutionCallbacks.evaluate(RunBeforeTestExecutionCallbacks.java:74)
```
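A message in roughly this layout can be assembled from the caught exception itself. The sketch below is an assumption about how that might look, not the original interceptor: `AlarmMessage`, `format`, `toTextPayload`, and the English labels are hypothetical names. The JSON shape in `toTextPayload` follows the DingTalk custom robot text message as I understand it (`msgtype` plus `text.content`); check it against the current DingTalk documentation before relying on it. The HTTP call to the webhook is left out.

```java
final class AlarmMessage {

    private AlarmMessage() {
    }

    /** Builds the alarm text, keeping at most maxFrames stack frames. */
    static String format(String project, Throwable t, int maxFrames) {
        StackTraceElement[] frames = t.getStackTrace();
        // The top frame is where the exception was thrown.
        String className = frames.length > 0 ? frames[0].getClassName() : "unknown";
        String methodName = frames.length > 0 ? frames[0].getMethodName() : "unknown";

        StringBuilder sb = new StringBuilder();
        sb.append("Alarm\n");
        sb.append("Project: ").append(project).append('\n');
        sb.append("Class: ").append(className).append('\n');
        sb.append("Method: ").append(methodName).append('\n');
        sb.append("Exception: ").append(t.getClass().getName()).append('\n');
        sb.append("Stack trace:\n");
        int limit = Math.min(Math.max(maxFrames, 0), frames.length);
        for (int i = 0; i < limit; i++) {
            sb.append(frames[i]).append('\n');
        }
        return sb.toString();
    }

    /** Wraps the text in the assumed DingTalk robot "text" message body. */
    static String toTextPayload(String content) {
        return "{\"msgtype\":\"text\",\"text\":{\"content\":\"" + escapeJson(content) + "\"}}";
    }

    /** Minimal JSON string escaping so the payload stays valid. */
    static String escapeJson(String s) {
        StringBuilder sb = new StringBuilder(s.length() + 16);
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '"':  sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                case '\n': sb.append("\\n");  break;
                case '\r': sb.append("\\r");  break;
                case '\t': sb.append("\\t");  break;
                default:
                    if (c < 0x20) {
                        sb.append(String.format("\\u%04x", (int) c));
                    } else {
                        sb.append(c);
                    }
            }
        }
        return sb.toString();
    }
}
```

Capping the number of frames (the sample shows ten) keeps each message short enough to read in a chat window while still pointing at the failing line.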
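Is an alert like this enough?
Clearly not. There were too many front-end projects (Android, iOS, WeChat mini program, web, quick apps), and the back end needed to identify which of them a problem came from. So we improved it once more: we asked the front-end developers to add extra parameters to the HTTP headers. Some of these parameters were also added to distinguish traffic.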
```
{
  "Content-Type": "application/json",
  "Authorization": "Bearer xxxxxxxxxx",      ## JWT
  "Client-Business-Group-Source": "1",       ## unique ID of the business group
  "Client-Business-Source": "1000",          ## unique ID of the business source
  "Client-Business-Activity-Source": "1",    ## e.g. "view intro", "more"; identifies traffic from specific business activities
  "Client-Env-Source": "1",                  ## client environment: 1 iOS, 2 Android, 3 Windows
  "Client-Platform-Source": "xxx",           ## client platform: phone model or browser
  "Client-Start-Time": "1",                  ## request timestamp
  "Client-Version-Source": "1.0.0"           ## client version
}
```
With the improvements above, the alert message became much more complete.
```text
Alarm
Project: z201-gateway
Class: cn.z201.cloud.gateway.VlinkFrameworkGatewayApplicationTest
Method: alarm
Exception: java.lang.IllegalAccessException
Extended info:
{
  "Client-Business-Source": "1000",
  "Client-Business-Activity-Source": "1",
  "Client-Env-Source": "1",
  "Client-Platform-Source": "xxx",
  "Client-Version-Source": "1.0.0"
}
Stack trace:
cn.z201.cloud.gateway.VlinkFrameworkGatewayApplicationTest.alarm(VlinkFrameworkGatewayApplicationTest.java:64)
sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
java.lang.reflect.Method.invoke(Method.java:498)
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
org.springframework.test.context.junit4.statements.RunBeforeTestExecutionCallbacks.evaluate(RunBeforeTestExecutionCallbacks.java:74)
```
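The extended info in this message is just a subset of the request headers. As a rough sketch of how that subset could be picked out, assuming the headers are already available as a plain map: `ClientSourceInfo` and `extract` are hypothetical names, and the list of header names is taken from the sample above.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

final class ClientSourceInfo {

    /** Header names that appear in the extended info of the sample alarm. */
    static final List<String> NAMES = Arrays.asList(
            "Client-Business-Source",
            "Client-Business-Activity-Source",
            "Client-Env-Source",
            "Client-Platform-Source",
            "Client-Version-Source");

    private ClientSourceInfo() {
    }

    /** Returns only the client-source headers that are present and non-empty. */
    static Map<String, String> extract(Map<String, String> requestHeaders) {
        // HTTP header names are case-insensitive, so normalise the keys first.
        Map<String, String> lower = new HashMap<>();
        for (Map.Entry<String, String> e : requestHeaders.entrySet()) {
            if (e.getKey() != null) {
                lower.put(e.getKey().toLowerCase(Locale.ROOT), e.getValue());
            }
        }
        // LinkedHashMap keeps the output in the same order as NAMES.
        Map<String, String> result = new LinkedHashMap<>();
        for (String name : NAMES) {
            String value = lower.get(name.toLowerCase(Locale.ROOT));
            if (value != null && !value.isEmpty()) {
                result.put(name, value);
            }
        }
        return result;
    }
}
```

Using an explicit allow-list rather than dumping every header means values such as `Authorization` never end up in a chat group.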
Problems in a distributed setup
```mermaid
sequenceDiagram
    participant gateway
    participant A as Application A
    participant B as Application B
    gateway ->> A : httpRequest
    A ->> B : httpRequest
    B -->> A : httpResponse
    A -->> gateway : httpResponse
```
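When the call chain gets long, locating a problem becomes tricky. Say application A calls application B, and B hits an exception and sends the alert itself. Spring Cloud's HTTP-based RPC (Feign) does not by default pass the incoming request's headers on to downstream services, so B's alert would lack the client information. We need a small extension to forward them.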
```java
// HttpApiConstant.java
public interface HttpApiConstant {

    String X_REAL_IP = "x-real-ip";
    String HTTP_HEADER_TRACE_ID = "AppTraceId";
    String HTTP_TOKEN_HEADER = "Authorization";
    String APP_TENANT = "Tenant";
    String CLIENT_BUSINESS_GROUP_SOURCE = "Client-Business-Group-Source";
    String CLIENT_BUSINESS_SOURCE = "Client-Business-Source";
    String CLIENT_BUSINESS_ACTIVITY_SOURCE = "Client-Business-Activity-Source";
    String CLIENT_EVN_SOURCE = "Client-Env-Source";
    String CLIENT_PLATFORM_SOURCE = "Client-Platform-Source";
    String CLIENT_START_TIME = "Client-Start-Time";
    String CLIENT_VERSION_SOURCE = "Client-Version-Source";
}

// MdcFeignInterceptorConfig.java
import java.util.Enumeration;

import javax.servlet.http.HttpServletRequest;

import feign.RequestInterceptor;
import feign.RequestTemplate;
import lombok.extern.slf4j.Slf4j;
import org.springframework.boot.autoconfigure.condition.ConditionalOnClass;
import org.springframework.http.HttpHeaders;
import org.springframework.web.context.request.RequestAttributes;
import org.springframework.web.context.request.RequestContextHolder;
import org.springframework.web.context.request.ServletRequestAttributes;
import org.springframework.web.servlet.config.annotation.WebMvcConfigurer;

@Slf4j
@ConditionalOnClass(WebMvcConfigurer.class)
public class MdcFeignInterceptorConfig implements RequestInterceptor, HttpApiConstant {

    /** Incoming headers copied onto every outgoing Feign request. */
    private static final String[] FORWARDED_HEADERS = {
            X_REAL_IP,
            HTTP_TOKEN_HEADER,
            HTTP_HEADER_TRACE_ID, // assumed intent: forward the trace ID as well
            CLIENT_BUSINESS_GROUP_SOURCE,
            CLIENT_BUSINESS_SOURCE,
            CLIENT_BUSINESS_ACTIVITY_SOURCE,
            CLIENT_EVN_SOURCE,
            CLIENT_PLATFORM_SOURCE,
            CLIENT_START_TIME,
            CLIENT_VERSION_SOURCE
    };

    public MdcFeignInterceptorConfig() {
        log.info("Loaded Z-REST-INTERCEPTOR [V1.0.0]");
    }

    @Override
    public void apply(RequestTemplate template) {
        try {
            // Feign calls made outside a servlet request (e.g. scheduled jobs)
            // have no incoming headers to forward.
            RequestAttributes attributes = RequestContextHolder.getRequestAttributes();
            if (!(attributes instanceof ServletRequestAttributes)) {
                return;
            }
            HttpServletRequest request = ((ServletRequestAttributes) attributes).getRequest();

            template.header(HttpHeaders.ACCEPT_ENCODING, "gzip");
            for (String name : FORWARDED_HEADERS) {
                String value = request.getHeader(name);
                // Skip absent headers instead of passing null to Feign.
                if (value != null) {
                    template.header(name, value);
                }
            }

            if (log.isDebugEnabled()) {
                Enumeration<String> headerNames = request.getHeaderNames();
                while (headerNames.hasMoreElements()) {
                    String name = headerNames.nextElement();
                    log.debug("header {} - {}", name, request.getHeader(name));
                }
            }
        } catch (Exception e) {
            // Pass the exception itself so the stack trace is logged.
            log.error("template exception", e);
        }
    }
}
```
That wraps up this simple DingTalk exception alerting setup.