
Android ANR (Application Not Responding): Analysis and Solutions

开发者 https://www.devze.com 2022-12-16 10:19 Source: Internet Author: itbird01
Contents
  • 1. What is ANR?
  • 2. ANR types and how to avoid each
    • 2.1 KeyDispatchTimeout
  • 3. How to analyze ANR logs
  • 4. Case studies
    • 4.1 IO wait example
    • 4.2 Memory leak / Thread leak

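    1. What is ANR?

    ANR stands for Application Not Responding: the application has stopped responding.

    2. ANR types and how to avoid each

    There are generally three types of ANR:

    1) KeyDispatchTimeout (5 seconds): a key or touch event gets no response within the time limit.

    2) BroadcastTimeout (10 seconds): a BroadcastReceiver fails to finish within the time limit.

    3) ServiceTimeout (20 seconds or 200 seconds): a Service fails to finish within the time limit. The limit is 20 s for a foreground service and 200 s for a background service.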

    2.1 KeyDispatchTimeout

    A key or touch event was not dispatched within the specified time.

    The timeout itself is defined in the framework:

    ActivityManagerService.java
    // How long we wait until we timeout on key dispatching.
    static final int KEY_DISPATCHING_TIMEOUT = 5*1000;
    

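    Why does the timeout happen? The clock generally starts when the key event is dispatched to the app. There are usually two causes:

    1) The current event never gets a chance to be handled (the UI thread is still busy with a previous event, or the looper is blocked for some reason).

    2) The current event is being handled but does not finish in time.

    How do you avoid KeyDispatchTimeout?

    1) Keep the UI thread for UI work only.

    2) Move time-consuming work (database operations, I/O, network access, or anything else that may block the UI thread) to a separate thread.

    3) Prefer a Handler for communication between the UI thread and other threads.

    The same reasoning applies to Service and BroadcastReceiver, so it is not repeated here.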
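
    The advice above (keep slow work off the UI thread, then hand the result back) can be sketched in plain Java. This is a JVM-only model, not Android code: the single-thread executor named "ui" merely stands in for the main Looper thread, "worker" stands in for a background thread, and slowQuery is a hypothetical placeholder for a database or network call. On Android, the hand-back step would typically go through a Handler bound to the main Looper instead.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;

public class OffloadSketch {
    // Hypothetical slow operation (stands in for a DB query or network call).
    static String slowQuery() throws InterruptedException {
        Thread.sleep(200);
        return "result";
    }

    // Runs slowQuery on the worker thread, delivers the result on the "ui" thread,
    // and returns "<delivering thread name>:<value>", or null on timeout/interrupt.
    public static String runOnce() {
        ExecutorService uiThread = Executors.newSingleThreadExecutor(r -> new Thread(r, "ui"));
        ExecutorService worker = Executors.newSingleThreadExecutor(r -> new Thread(r, "worker"));
        AtomicReference<String> delivered = new AtomicReference<>();
        CountDownLatch done = new CountDownLatch(1);
        try {
            // The "event handler" on the ui thread only schedules work and returns at once.
            uiThread.execute(() -> worker.execute(() -> {
                try {
                    String value = slowQuery();   // slow part runs off the ui thread
                    uiThread.execute(() -> {      // hand the result back to the ui thread
                        delivered.set(Thread.currentThread().getName() + ":" + value);
                        done.countDown();
                    });
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    done.countDown();
                }
            }));
            done.await(5, TimeUnit.SECONDS);
            return delivered.get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return null;
        } finally {
            uiThread.shutdown();
            worker.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(runOnce());
    }
}
```

    The point of the shape is that the "ui" thread never sleeps inside slowQuery; it only receives a short task once the result is ready, so later events queued on it are not held up.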

    3. How to analyze ANR logs

    Let's look at a log first:

    04-01 13:12:11.572 I/InputDispatcher(  220): Application is not responding: Window{2b263310 com.android.email/com.android.email.activity.SplitScreenActivity paused=false}.  5009.8ms since event, 5009.5ms since wait started
    04-01 13:12:11.572 I/WindowManager(  220): Input event dispatching timed out sending to com.android.email/com.android.email.activity.SplitScreenActivity
    04-01 13:12:14.123 I/Process(  220): Sending signal. PID: 21404 SIG: 3   --- time the ANR occurred and traces.txt was generated
    04-01 13:12:14.123 I/dalvikvm(21404): threadid=4: reacting to signal 3
    ...
    04-01 13:12:15.872 E/ActivityManager(  220): ANR in com.android.email (com.android.email/.activity.SplitScreenActivity)
    04-01 13:12:15.872 E/ActivityManager(  220): Reason: keyDispatchingTimedOut
    04-01 13:12:15.872 E/ActivityManager(  220): Load: 8.68 / 8.37 / 8.53
    04-01 13:12:15.872 E/ActivityManager(  220): CPU usage from 4361ms to 699ms ago   ---- CPU usage before the ANR
    04-01 13:12:15.872 E/ActivityManager(  220):   5.5% 21404/com.android.email: 1.3% user + 4.1% kernel / faults: 10 minor
    04-01 13:12:15.872 E/ActivityManager(  220):   4.3% 220/system_server: 2.7% user + 1.5% kernel / faults: 11 minor 2 major
    04-01 13:12:15.872 E/ActivityManager(  220):   0.9% 52/spi_qsd.0: 0% user + 0.9% kernel
    04-01 13:12:15.872 E/ActivityManager(  220):   0.5% 65/irq/170-cyttsp-: 0% user + 0.5% kernel
    04-01 13:12:15.872 E/ActivityManager(  220):   0.5% 296/com.android.systemui: 0.5% user + 0% kernel
    04-01 13:12:15.872 E/ActivityManager(  220): 100% TOTAL: 4.8% user + 7.6% kernel + 87% iowait
    04-01 13:12:15.872 E/ActivityManager(  220): CPU usage from 3697ms to 4223ms later:   -- CPU usage after the ANR
    04-01 13:12:15.872 E/ActivityManager(  220):   25% 21404/com.android.email: 25% user + 0% kernel / faults: 191 minor
    04-01 13:12:15.872 E/ActivityManager(  220):     16% 21603/__eas(par.hakan: 16% user + 0% kernel
    04-01 13:12:15.872 E/ActivityManager(  220):     7.2% 21406/GC: 7.2% user + 0% kernel
    04-01 13:12:15.872 E/ActivityManager(  220):     1.8% 21409/Compiler: 1.8% user + 0% kernel
    04-01 13:12:15.872 E/ActivityManager(  220):   5.5% 220/system_server: 0% user + 5.5% kernel / faults: 1 minor
    04-01 13:12:15.872 E/ActivityManager(  220):     5.5% 263/InputDispatcher: 0% user + 5.5% kernel
    04-01 13:12:15.872 E/ActivityManager(  220): 32% TOTAL: 28% user + 3.7% kernel
    

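    In the ANR log we first spot the keyword "ANR in com.android.email", followed by "Reason: keyDispatchingTimedOut", which tells us that event handling timed out. The log does not say where. What it does tell us is whether the ANR came from high CPU usage (tasks starved of time slices) or from heavy IO. Here the TOTAL line before the ANR shows 87% iowait against only 4.8% user + 7.6% kernel, which points to IO. This matters because it gives us a direction for the trace analysis that follows.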
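
    The "CPU-bound or IO-bound?" check described above can be automated roughly as follows. This is only a sketch: the class name AnrCpuTotal, the labels it returns, and the 50% thresholds are assumptions made for illustration, not Android constants. The regular expression targets the two TOTAL line layouts that appear in the logs in this article.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class AnrCpuTotal {
    private static final String NUM = "(\\d+(?:\\.\\d+)?)";

    // Matches "100%TOTAL: 4.8% user + 7.6% kernel + 87% iowait" as well as
    // "TOTAL:100% = 98% user + 1% kernel"; the iowait part is optional.
    private static final Pattern TOTAL = Pattern.compile(
            "TOTAL:\\s*(?:\\d+(?:\\.\\d+)?%\\s*=\\s*)?"
            + NUM + "%\\s*user\\s*\\+\\s*"
            + NUM + "%\\s*kernel"
            + "(?:\\s*\\+\\s*" + NUM + "%\\s*iowait)?");

    // Returns "IO_WAIT", "CPU_BOUND" or "UNCLEAR" for a TOTAL line,
    // or null if the line is not a TOTAL line.
    public static String classify(String line) {
        Matcher m = TOTAL.matcher(line);
        if (!m.find()) {
            return null;
        }
        double user = Double.parseDouble(m.group(1));
        double kernel = Double.parseDouble(m.group(2));
        double iowait = m.group(3) == null ? 0.0 : Double.parseDouble(m.group(3));
        if (iowait >= 50.0) {            // assumed threshold
            return "IO_WAIT";
        }
        if (user + kernel >= 50.0) {     // assumed threshold
            return "CPU_BOUND";
        }
        return "UNCLEAR";
    }

    public static void main(String[] args) {
        System.out.println(classify("100%TOTAL: 4.8% user + 7.6% kernel + 87% iowait"));
    }
}
```

    A label like this is only a starting hint for reading the trace; it does not replace looking at the stacks.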

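    Besides the log, resolving an ANR also requires the traces.txt file. One way to get it is to run the first four commands below in a device shell (adb shell), then pull the file from the host with the last one: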

    $ chmod 777 /data/anr
    $ rm /data/anr/traces.txt
    $ ps
    $ kill -3 PID
    $ adb pull /data/anr/traces.txt ./mytraces.txt
    

    In traces.txt, the information you see most often looks like this:

    ----- pid 21404 at 2011-04-01 13:12:14 -----
    Cmd line: com.android.email
    DALVIK THREADS:
    (mutexes: tll=0 tsl=0 tscl=0 ghl=0 hwl=0 hwll=0)
    "main" prio=5 tid=1 NATIVE
      | group="main" sCount=1 dsCount=0 obj=0x2aad2248 self=0xcf70
      | sysTid=21404 nice=0 sched=0/0 cgrp=[fopen-error:2] handle=1876218976
      at android.os.MessageQueue.nativePollOnce(Native Method)
      at android.os.MessageQueue.next(MessageQueue.java:119)
      at android.os.Looper.loop(Looper.java:110)
      at android.app.ActivityThread.main(ActivityThread.java:3688)
      at java.lang.reflect.Method.invokeNative(Native Method)
      at java.lang.reflect.Method.invoke(Method.java:507)
      at com.android.internal.os.ZygoteInit$MethodAndArgsCaller.run(ZygoteInit.java:866)
      at com.android.internal.os.ZygoteInit.main(ZygoteInit.java:624)
      at dalvik.system.NativeStart.main(Native Method)
    

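    Here the main thread is sitting in nativePollOnce. As my earlier analysis of the Handler and MessageQueue source showed, this means the message queue is empty and the main thread is asleep, waiting for the next message to be enqueued and wake it. In other words, at the moment the trace was dumped the main thread was idle and waiting, so this stack by itself does not show what held it up. The 87% iowait in the log remains the better lead.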

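    Summary: how to investigate and resolve an ANR

    1) Start with the log to get a rough idea of the cause. Pinpointing it directly is ideal, but usually the log alone is not enough.

    2) Check the call stacks in traces.txt.

    3) Go through each thread's entry and compare it against your own code.

    4) Throughout, keep one question in focus: what caused the ANR (iowait? block? memory leak?).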
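
    Step 2 usually begins with the same lookup: find the "main" thread, note its state, and read its top frame. A small helper for that might look like the sketch below. The class name MainThreadProbe and its output format are made up for illustration, and the parsing assumes the Dalvik-style thread header shown in this article ("main" prio=5 tid=1 STATE), so other trace layouts may need a different pattern.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class MainThreadProbe {
    // Thread header, e.g.: "main" prio=5 tid=1 NATIVE
    private static final Pattern HEADER =
            Pattern.compile("^\"([^\"]+)\"\\s*prio=\\d+\\s+tid=\\d+\\s+(\\w+)");
    // Stack frame, e.g.: at android.os.Looper.loop(Looper.java:110)
    private static final Pattern FRAME = Pattern.compile("^at\\s+([^\\s(]+)\\(");

    // Returns "STATE @ top.frame" for the "main" thread, just "STATE" if the
    // thread has no frames, or null if no "main" thread entry is found.
    public static String summarize(String traces) {
        boolean inMain = false;
        String mainState = null;
        for (String raw : traces.split("\\R")) {
            String line = raw.trim();
            Matcher header = HEADER.matcher(line);
            if (header.find()) {
                if (inMain) {
                    break; // the next thread entry started; main had no frames
                }
                if (header.group(1).equals("main")) {
                    inMain = true;
                    mainState = header.group(2);
                }
                continue;
            }
            if (inMain) {
                Matcher frame = FRAME.matcher(line);
                if (frame.find()) {
                    return mainState + " @ " + frame.group(1);
                }
            }
        }
        return mainState;
    }

    public static void main(String[] args) {
        String sample = "\"main\" prio=5 tid=1 NATIVE\n"
                + "  | group=\"main\" sCount=1 dsCount=0 obj=0x2aad2248 self=0xcf70\n"
                + "  at android.os.MessageQueue.nativePollOnce(Native Method)\n"
                + "  at android.os.MessageQueue.next(MessageQueue.java:119)\n";
        System.out.println(summarize(sample));
    }
}
```

    The result is only a pointer into the trace; the full stack below the top frame is what shows which application code was running.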

    4. Case studies

    4.1 IO wait example

    Process: com.android.email
    Activity: com.android.email/.activity.MessageView
    Subject: keyDispatchingTimedOut
    CPU usage from 2550ms to -2814ms ago:
      5% 187/system_server: 3.5% user + 1.4% kernel / faults: 86 minor 20 major
      4.4% 1134/com.android.email: 0.7% user + 3.7% kernel / faults: 38 minor 19 major
      4% 372/com.android.eventstream: 0.7% user + 3.3% kernel / faults: 6 minor
      1.1% 272/com.android.phone: 0.9% user + 0.1% kernel / faults: 33 minor
      0.9% 252/com.android.systemui: 0.9% user + 0% kernel
      0% 409/com.android.eventstream.telephonyplugin: 0% user + 0% kernel / faults: 2 minor
      0.1% 632/com.android.devicemonitor: 0.1% user + 0% kernel
    100% TOTAL: 6.9% user + 8.2% kernel + 84% iowait

    ----- pid 1134 at 2010-12-17 17:46:51 -----
    Cmd line: com.android.email
    DALVIK THREADS:
    (mutexes: tll=0 tsl=0 tscl=0 ghl=0 hwl=0 hwll=0)
    "main" prio=5 tid=1 WAIT
      | group="main" sCount=1 dsCount=0 obj=0x2aaca180 self=0xcf20
      | sysTid=1134 nice=0 sched=0/0 cgrp=[fopen-error:2] handle=1876218976
      at java.lang.Object.wait(Native Method)
      - waiting on <0x2aaca218> (a java.lang.VMThread)
      at java.lang.Thread.parkFor(Thread.java:1424)
      at java.lang.LangAccessImpl.parkFor(LangAccessImpl.java:48)
      at sun.misc.Unsafe.park(Unsafe.java:337)
      at java.util.concurrent.locks.LockSupport.park(LockSupport.java:157)
      at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:808)
      at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:841)
      at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1171)
      at java.util.concurrent.locks.ReentrantLock$FairSync.lock(ReentrantLock.java:200)
      at java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:261)
      at android.database.sqlite.SQLiteDatabase.lock(SQLiteDatabase.java:378)
      at android.database.sqlite.SQLiteCursor.<init>(SQLiteCursor.java:222)
      at android.database.sqlite.SQLiteDirectCursorDriver.query(SQLiteDirectCursorDriver.java:53)
      at android.database.sqlite.SQLiteDatabase.rawQueryWithFactory(SQLiteDatabase.java:1356)
      at android.database.sqlite.SQLiteDatabase.queryWithFactory(SQLiteDatabase.java:1235)
      at android.database.sqlite.SQLiteDatabase.query(SQLiteDatabase.java:1189)
      at android.database.sqlite.SQLiteDatabase.query(SQLiteDatabase.java:1271)
      at com.android.email.provider.EmailProvider.query(EmailProvider.java:1098)
      at android.content.ContentProvider$Transport.query(ContentProvider.java:187)
      at android.content.ContentResolver.query(ContentResolver.java:268)
      at com.android.email.provider.EmailContent$Message.restoreMessageWithId(EmailContent.java:648)
      at com.android.email.Controller.setMessageRead(Controller.java:658)
      at com.android.email.activity.MessageView.onMarkAsRead(MessageView.java:700)
      at com.android.email.activity.MessageView.access$2500(MessageView.java:98)
      at com.android.email.activity.MessageView$LoadBodyTask.onPostExecute(MessageView.java:1290)
      at com.android.email.activity.MessageView$LoadBodyTask.onPostExecute(MessageView.java:1255)
      at android.os.AsyncTask.finish(AsyncTask.java:417)
      at android.os.AsyncTask.access$300(AsyncTask.java:127)
      at android.os.AsyncTask$InternalHandler.handleMessage(AsyncTask.java:429)
      at android.os.Handler.dispatchMessage(Handler.java:99)
      at android.os.Looper.loop(Looper.java:123)
      at android.app.ActivityThread.main(ActivityThread.java:3652)
      at java.lang.reflect.Method.invokeNative(Native Method)
      at java.lang.reflect.Method.invoke(Method.java:507)
      at com.android.internal.os.ZygoteIn
    

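    The log shows 84% iowait while CPU usage is low (6.9% user + 8.2% kernel), so the likely cause is heavy IO. The stack trace involves a View, a ContentProvider, SQLite, and a lock, so a bold first guess is: is a database operation running on the main thread?

    Reading the trace more closely, somewhere in our code a view must be calling into ContentResolver. Searching the code for such a call does turn one up: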

    final Message message = Message.restoreMessageWithId(mProviderContext, messageId);
    if (message == null) {
        return;
    }
    Account account = Account.restoreAccountWithId(mProviderContext, message.mAccountKey);
    if (account == null) {
        return; // isMessagingController returns false for null, but let's make it clear.
    }
    if (isMessagingController(account)) {
        new Thread() {
            @Override
            public void run() {
                mLegacyController.processPendingActions(message.mAccountKey);
            }
        }.start();
    }
    

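    See the problem? The main thread calls Message.restoreMessageWithId(mProviderContext, messageId), which is the frame that appears in the trace, and then Account.restoreAccountWithId(mProviderContext, message.mAccountKey), and both query the database. When system resources are not under pressure this causes no real trouble, but if the data is large or the system is busy with IO, this code runs slowly and the main thread's event handling times out. So the code was changed as follows, moving the lookups into the worker thread, and the problem went away.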

    new Thread() {
        @Override
        public void run() {
            // Both lookups query the database, so they now run off the main thread.
            final Message message = Message.restoreMessageWithId(mProviderContext, messageId);
            if (message == null) {
                return;
            }
            Account account = Account.restoreAccountWithId(mProviderContext, message.mAccountKey);
            if (account == null) {
                return; // isMessagingController returns false for null, but let's make it clear.
            }
            if (isMessagingController(account)) {
                mLegacyController.processPendingActions(message.mAccountKey);
            }
        }
    }.start();
    

    4.2 Memory leak / Thread leak

    11-16 21:41:42.560 I/ActivityManager( 1190): ANR in process: android.process.acore (last in android.process.acore)
    11-16 21:41:42.560 I/ActivityManager( 1190): Annotation: keyDispatchingTimedOut
    11-16 21:41:42.560 I/ActivityManager( 1190): CPU usage:
    11-16 21:41:42.560 I/ActivityManager( 1190): Load: 11.5 / 11.1 / 11.09
    11-16 21:41:42.560 I/ActivityManager( 1190): CPU usage from 9046ms to 4018ms ago:
    11-16 21:41:42.560 I/ActivityManager( 1190): d.process.acore: 98% = 97% user + 0% kernel / faults: 1134 minor
    11-16 21:41:42.560 I/ActivityManager( 1190): system_server: 0% = 0% user + 0% kernel / faults: 1 minor
    11-16 21:41:42.560 I/ActivityManager( 1190): adbd: 0% = 0% user + 0% kernel
    11-16 21:41:42.560 I/ActivityManager( 1190): logcat: 0% = 0% user + 0% kernel
    11-16 21:41:42.560 I/ActivityManager( 1190): TOTAL: 100% = 98% user + 1% kernel

    Cmd line: android.process.acore
    DALVIK THREADS:
    "main" prio=5 tid=3 VMWAIT
      | group="main" sCount=1 dsCount=0 s=N obj=0x40026240 self=0xbda8
      | sysTid=1815 nice=0 sched=0/0 cgrp=unknown handle=-1344001376
      at dalvik.system.VMRuntime.trackExternalAllocation(Native Method)
      at android.graphics.Bitmap.nativeCreate(Native Method)
      at android.graphics.Bitmap.createBitmap(Bitmap.java:468)
      at android.view.View.buildDrawingCache(View.java:6324)
      at android.view.View.getDrawingCache(View.java:6178)
      at android.view.ViewGroup.drawChild(ViewGroup.java:1541)
      at com.android.internal.policy.impl.PhoneWindow$DecorView.draw(PhoneWindow.java:1830)
      at android.view.ViewRoot.draw(ViewRoot.java:1349)
      at android.view.ViewRoot.performTraversals(ViewRoot.java:1114)
      at android.view.ViewRoot.handleMessage(ViewRoot.java:1633)
      at android.os.Handler.dispatchMessage(Handler.java:99)
      at android.os.Looper.loop(Looper.java:123)
      at android.app.ActivityThread.main(ActivityThread.java:4370)
      at java.lang.reflect.Method.invokeNative(Native Method)
      at java.lang.reflect.Method.invoke(Method.java:521)
      at com.android.internal.os.ZygoteInit$MethodAndArgsCaller.run(ZygoteInit.java:868)
      at com.android.internal.os.ZygoteInit.main(ZygoteInit.java:626)
      at dalvik.system.NativeStart.main(Native Method)

    "Thread-408" prio=5 tid=329 WAIT
      | group="main" sCount=1 dsCount=0 s=N obj=0x46910d40 self=0xcd0548
      | sysTid=10602 nice=0 sched=0/0 cgrp=unknown handle=15470792
      at java.lang.Object.wait(Native Method)
      - waiting on <0x468cd420> (a java.lang.Object)
      at java.lang.Object.wait(Object.java:288)
      at com.android.dialer.CallLogContentHelper$UiUpdaterExecutor$1.run(CallLogContentHelper.java:289)
      at java.lang.Thread.run(Thread.java:1096)
    

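    In this log the main thread is stopped inside VMRuntime, and the keywords Bitmap, createBitmap, nativeCreate, ViewRoot, and ActivityThread#main stand out. A bold guess: is a large image being loaded inside the main thread's view drawing path, or were bitmaps allocated and never released? Looking closely at the top frame, at dalvik.system.VMRuntime.trackExternalAllocation(Native Method), the bitmap allocation found too little memory available and blocked.

    The way forward is straightforward: use the remaining thread, process, and stack details to work backwards to the relevant code and look for places that may leak memory.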
