<!DOCTYPE html>
<html lang="en">
<head>
<script src="https://use.fontawesome.com/afd448ce82.js"></script>
<!-- Meta Tag -->
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<meta http-equiv="X-UA-Compatible" content="IE=edge">
<!-- SEO -->
<meta name="author" content="Bruno Rocha">
<meta name="keywords" content="Software, Engineering, Blog, Posts, iOS, Xcode, Swift, Articles, Tutorials, OBJ-C, Objective-C, Apple">
<meta name="description" content="Ever had a crash in which you had absolutely no idea what was going on, and no amount of testing allowed you to reproduce the issue? If so, you've come to the right place!">
<meta name="title" content="How To Solve Any iOS Crash Ever">
<meta name="url" content="https://swiftrocks.com/how-to-solve-any-ios-crash-ever">
<meta name="image" content="https://swiftrocks.com/images/thumbs/thumb.jpg?4">
<meta name="copyright" content="Bruno Rocha">
<meta name="robots" content="index,follow">
<meta property="og:title" content="How To Solve Any iOS Crash Ever"/>
<meta property="og:image" content="https://swiftrocks.com/images/thumbs/thumb.jpg?4"/>
<meta property="og:description" content="Ever had a crash in which you had absolutely no idea what was going on, and no amount of testing allowed you to reproduce the issue? If so, you've come to the right place!"/>
<meta property="og:type" content="website"/>
<meta property="og:url" content="https://swiftrocks.com/how-to-solve-any-ios-crash-ever"/>
<meta name="twitter:card" content="summary_large_image"/>
<meta name="twitter:image" content="https://swiftrocks.com/images/thumbs/thumb.jpg?4"/>
<meta name="twitter:image:alt" content="Page Thumbnail"/>
<meta name="twitter:title" content="How To Solve Any iOS Crash Ever"/>
<meta name="twitter:description" content="Ever had a crash in which you had absolutely no idea what was going on, and no amount of testing allowed you to reproduce the issue? If so, you've come to the right place!"/>
<meta name="twitter:site" content="@rockbruno_"/>
<!-- Favicon -->
<link rel="icon" type="image/png" href="images/favicon/iconsmall2.png" sizes="32x32" />
<link rel="apple-touch-icon" href="images/favicon/iconsmall2.png">
<link rel="preconnect" href="https://fonts.googleapis.com">
<link rel="preconnect" href="https://fonts.gstatic.com" crossorigin>
<link href="https://fonts.googleapis.com/css2?family=Source+Sans+3:ital,wght@0,200..900;1,200..900&display=swap" rel="stylesheet">
<!-- Bootstrap CSS Plugins -->
<link rel="stylesheet" type="text/css" href="css/bootstrap.css">
<!-- Prism CSS Stylesheet -->
<link rel="stylesheet" type="text/css" href="css/prism4.css">
<!-- Main CSS Stylesheet -->
<link rel="stylesheet" type="text/css" href="css/style48.css">
<link rel="stylesheet" type="text/css" href="css/sponsor4.css">
<!-- HTML5 shiv and Respond.js support IE8 or Older for HTML5 elements and media queries -->
<!--[if lt IE 9]>
<script src="https://oss.maxcdn.com/html5shiv/3.7.3/html5shiv.min.js"></script>
<script src="https://oss.maxcdn.com/respond/1.4.2/respond.min.js"></script>
<![endif]-->
<script type="application/ld+json">
{
"@context": "https://schema.org",
"@type": "BlogPosting",
"mainEntityOfPage": {
"@type": "WebPage",
"@id": "https://swiftrocks.com/how-to-solve-any-ios-crash-ever"
},
"image": [
"https://swiftrocks.com/images/thumbs/thumb.jpg"
],
"datePublished": "2021-11-01T14:00:00+02:00",
"dateModified": "2021-11-01T14:00:00+02:00",
"author": {
"@type": "Person",
"name": "Bruno Rocha"
},
"publisher": {
"@type": "Organization",
"name": "SwiftRocks",
"logo": {
"@type": "ImageObject",
"url": "https://swiftrocks.com/images/thumbs/thumb.jpg"
}
},
"headline": "How To Solve Any iOS Crash Ever",
"abstract": "Ever had a crash in which you had absolutely no idea what was going on, and no amount of testing allowed you to reproduce the issue? If so, you've come to the right place!"
}
</script>
</head>
<body>
<div id="main">
<!-- Blog Header -->
<!-- Blog Post (Right Sidebar) Start -->
<div class="container">
<div class="col-xs-12">
<div class="page-body">
<div class="row">
<div><a href="https://swiftrocks.com">
<img id="logo" class="logo" alt="SwiftRocks" src="images/bg/logo2light.png">
</a>
<div class="menu-large">
<div class="menu-arrow-right"></div>
<div class="menu-header menu-header-large">
<div class="menu-item">
<a href="blog">blog</a>
</div>
<div class="menu-item">
<a href="about">about</a>
</div>
<div class="menu-item">
<a href="talks">talks</a>
</div>
<div class="menu-item">
<a href="projects">projects</a>
</div>
<div class="menu-item">
<a href="software-engineering-book-recommendations">book recs</a>
</div>
<div class="menu-item">
<a href="games">game recs</a>
</div>
<div class="menu-arrow-right-2"></div>
</div>
</div>
<div class="menu-small">
<div class="menu-arrow-right"></div>
<div class="menu-header menu-header-small-1">
<div class="menu-item">
<a href="blog">blog</a>
</div>
<div class="menu-item">
<a href="about">about</a>
</div>
<div class="menu-item">
<a href="talks">talks</a>
</div>
<div class="menu-item">
<a href="projects">projects</a>
</div>
<div class="menu-arrow-right-2"></div>
</div>
<div class="menu-arrow-right"></div>
<div class="menu-header menu-header-small-2">
<div class="menu-item">
<a href="software-engineering-book-recommendations">book recs</a>
</div>
<div class="menu-item">
<a href="games">game recs</a>
</div>
<div class="menu-arrow-right-2"></div>
</div>
</div>
</div>
<div class="content-page" id="WRITEIT_DYNAMIC_CONTENT">
<!--WRITEIT_POST_NAME=How To Solve Any iOS Crash Ever-->
<!--WRITEIT_POST_HTML_NAME=how-to-solve-any-ios-crash-ever-->
<!--Add here the additional properties that you want each page to possess.-->
<!--These properties can be used to change content in the template page or in the page itself as shown here.-->
<!--Properties must start with 'WRITEIT_POST'.-->
<!--Writeit provides and injects WRITEIT_POST_NAME and WRITEIT_POST_HTML_NAME by default.-->
<!--WRITEIT_POST_SHORT_DESCRIPTION=Ever had a crash in which you had absolutely no idea what was going on, and no amount of testing allowed you to reproduce the issue? If so, you've come to the right place!-->
<!--DateFormat example: 2021-11-02T14:00:00+02:00-->
<!--WRITEIT_POST_SITEMAP_DATE_LAST_MOD=2021-11-01T14:00:00+02:00-->
<!--WRITEIT_POST_SITEMAP_DATE=2021-11-01T14:00:00+02:00-->
<title>How To Solve Any iOS Crash Ever</title>
<div class="blog-post">
<div class="post-title-index">
<h1>How To Solve Any iOS Crash Ever</h1>
</div>
<div class="post-info">
<div class="post-info-text">Published on 01 Nov 2021</div>
</div>
<p><i>Closed: Cannot Reproduce</i></p>
<p>Ever had a crash in which you had absolutely no idea what was going on, and no amount of testing allowed you to reproduce the issue? If so, you've come to the right place!</p>
<div class="sponsor-article-ad-auto hidden"></div>
<p>Well, sort of. As you'll see in this article, the ability to debug complex crashes is not something you gain immediately. Set your expectations accordingly: there's no magical instrument that you forgot to run that will give you the output you're expecting. When it comes to complex crashes, what we need to do instead is <i>prepare</i> our environment so that these issues are better understood when they arrive, making them more actionable. Let's see how to do that!</p>
<h2>What are complex crashes?</h2>
<p>One thing that I find helpful is to rationalize the issue. It's easy to look at a weird issue and just dismiss it as something magical that will never happen again, but that makes no sense. There's always a perfectly logical reason the issue happened (most likely your fault), and the more users are affected by it, the more likely it is that this is not a freak accident. So how come you might be looking right now at an issue that affects a large number of users, and still you have no idea what's going on or how to reproduce it?</p>
<p>In my experience, the inability to understand a crash will always boil down to <b>a lack of information</b>. The problem is <i>never</i> that the issue is "too complicated", <b>but that you don't have enough data.</b> Think about the weirdest crash you ever had to look at: wouldn't it be a lot less complicated if the crash report told you exactly what the issue was and how to solve it? It doesn't matter how bizarre a crash is; it's your ability to understand and reproduce the issue that dictates how likely it is to be solved.</p>
<p>Thus, if you want to be able to solve any crash ever, you need to enhance the data that accompanies them. Let's take a look at a couple of ways to do that!</p>
<h2>App-specific metadata</h2>
<p>You might have noticed that crash platforms like Firebase will always include some useful pieces of device-related metadata on your crashes, such as the most common iOS version causing the crash, if the users were in foreground or background, if the devices are jailbroken, how much disk space each user had left when the crash happened, and so on. These are extremely useful, but are not nearly enough. What you truly need here is to include metadata of <b>your app</b> which helps you pinpoint what the user was doing at the moment of the crash. Some examples of things you should add are:</p>
<ul>
<li>The screen the user was looking at</li>
<li>The "type" of the user, if applicable (free? premium? logged out?)</li>
<li>The last action the user did (did they try to navigate somewhere?)</li>
<li>Did the app finish launching correctly?</li>
<li>Did the user receive a memory warning?</li>
<li>Is the app shutting down?</li>
<li>Does the user have an active internet connection?</li>
<li>Which language is the user seeing?</li>
</ul>
<p>It's hard to provide a complete list given that this will be completely different from app to app, but what you need to do here is essentially assemble everything you can think of about your app that can make a difference in its execution and include it in your crash reports.</p>
<p>You can add this information to Firebase through its SDK's key/value pairs API, but in order to see this information as percentages you will probably have to abstract Firebase under your own crash reporting backend.</p>
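<p>As a rough sketch of what this could look like in code, the example below routes app-specific metadata through a small abstraction. Every name in it (CrashMetadataSink, InMemorySink, CrashMetadata, the key strings) is hypothetical and only there for illustration. In a real app, the sink would forward the pairs to your crash reporting SDK, for example through Firebase Crashlytics' custom keys.</p>

```swift
// Hypothetical abstraction over a crash reporting SDK's key/value API.
protocol CrashMetadataSink: AnyObject {
    func setValue(_ value: String, forKey key: String)
}

// In-memory sink so that this sketch runs on its own.
// A production sink could instead forward each pair to the SDK,
// e.g. Firebase Crashlytics' setCustomValue(_:forKey:).
final class InMemorySink: CrashMetadataSink {
    private(set) var values: [String: String] = [:]

    func setValue(_ value: String, forKey key: String) {
        values[key] = value
    }
}

enum UserType: String {
    case loggedOut, free, premium
}

// Single entry point the rest of the app calls whenever relevant state changes.
struct CrashMetadata {
    let sink: CrashMetadataSink

    func screenChanged(to screen: String) {
        sink.setValue(screen, forKey: "current_screen")
    }

    func userTypeChanged(to userType: UserType) {
        sink.setValue(userType.rawValue, forKey: "user_type")
    }

    func lastAction(_ action: String) {
        sink.setValue(action, forKey: "last_action")
    }

    func receivedMemoryWarning() {
        sink.setValue("true", forKey: "memory_warning")
    }
}
```

<p>Keeping the keys behind one type like this makes it harder for two features to report the same thing under different names, which matters once you start aggregating the values.</p>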
<h2>Analytics</h2>
<p>In addition to metadata, another critical component is to have a solid analytics infrastructure in your app. This is something that most apps might already include, though it might require some changes to make it usable for crash reporting purposes.</p>
<p>The point here is that if you have a solid analytics implementation, you should be able to use it to replay a user's steps to the crash. Thus, for this to happen, you need to make sure your analytics SDK is receiving as much information as possible regarding user interactions like:</p>
<ul>
<li>"User touched button X"</li>
<li>"User saw banner Y"</li>
<li>"User navigated to screen Z"</li>
</ul>
<p>For the replayability itself, most third-party SDKs nowadays include a "timeline" feature that shows you all the events sent by a particular user around a specific time.</p>
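<p>If your SDK doesn't offer a timeline, the idea itself is simple enough to sketch. The example below is an assumption of how such a trail could work, not how any particular SDK does it: it keeps the most recent events in memory and returns the ones that happened shortly before a given moment. The names TrackedEvent and EventTrail are hypothetical.</p>

```swift
// One analytics event, reduced to what is needed to replay the user's steps.
struct TrackedEvent {
    let name: String
    let timestamp: Double // seconds since an arbitrary reference point
}

// Keeps only the most recent `capacity` events.
struct EventTrail {
    private(set) var events: [TrackedEvent] = []
    let capacity: Int

    init(capacity: Int) {
        self.capacity = capacity
    }

    mutating func record(_ name: String, at timestamp: Double) {
        events.append(TrackedEvent(name: name, timestamp: timestamp))
        if events.count > capacity {
            // Drop the oldest events once the trail is full.
            events.removeFirst(events.count - capacity)
        }
    }

    // Names of the events in the `window` seconds leading up to `crashTime`, oldest first.
    func steps(before crashTime: Double, window: Double) -> [String] {
        events
            .filter { $0.timestamp <= crashTime && $0.timestamp >= crashTime - window }
            .map { $0.name }
    }
}
```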
<h2>Using this information to solve crashes</h2>
<p>Finally, with your crashes receiving as much information as possible about the state of the app at the moment of the crash, you can now follow this step-by-step guide I made that should help you track down and solve the great majority of cases!</p>
<h3>Check the crashed thread</h3>
<p>If you're reading this guide it probably means that you already tried this and it didn't work, but it's good to mention anyway that in the vast majority of cases the answer lies directly in the stack trace of the crashed thread. By checking the path the code took, you may be able to locate and reproduce the issue.</p>
<h3>Check the metadata for the crash</h3>
<p>If the trace is vague, then looking at your added metadata may reveal the issue. When looking at the metadata, pay attention to values that are close to either 100% or 0%. This may reveal that the crash is tied to a very specific device or condition inside the app.</p>
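<p>To make the percentages idea concrete, here is a minimal sketch that assumes you can export the occurrences of a crash as plain key/value dictionaries (the CrashReport shape and both function names are made up for this example). It computes how often each value appears and keeps the ones that are close to 100%. A value close to 0% is simply one that is common among your users in general but barely shows up here.</p>

```swift
// One crash occurrence, reduced to its metadata key/value pairs (hypothetical shape).
typealias CrashReport = [String: String]

// For each "key=value" pair, the percentage of reports that contain it.
func metadataPercentages(in reports: [CrashReport]) -> [String: Double] {
    guard !reports.isEmpty else { return [:] }
    var counts: [String: Int] = [:]
    for report in reports {
        for (key, value) in report {
            counts["\(key)=\(value)", default: 0] += 1
        }
    }
    return counts.mapValues { Double($0) / Double(reports.count) * 100 }
}

// Pairs at or above `threshold` percent, most common first (ties sorted by name).
func dominantValues(in reports: [CrashReport], threshold: Double = 90) -> [String] {
    metadataPercentages(in: reports)
        .filter { $0.value >= threshold }
        .sorted { $0.value != $1.value ? $0.value > $1.value : $0.key < $1.key }
        .map { $0.key }
}
```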
<h3>Check the background threads of the crash</h3>
<p>If the metadata is <i>also</i> vague, then it may mean that the crashed code is not the problem itself, but more of an indirect consequence of a problem that happened asynchronously somewhere else. In this situation, you may be able to locate the issue by looking at what's happening in the other threads of the crash. Try grabbing many occurrences of the crash and comparing their threads with each other. Do they all have something in common that you don't see in other issues? If so, that could be the cause of the problem.</p>
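<p>Doing this comparison by eye gets tiring fast, so here is a small sketch of how it could be automated. It assumes each occurrence has already been parsed into a list of threads, each one a list of symbolicated frame names. That shape and the function names are assumptions of this example, not something a crash platform hands you directly.</p>

```swift
// A crash occurrence as a list of threads, each one a list of frame symbols (hypothetical shape).
typealias Occurrence = [[String]]

// Frames that show up in at least one thread of every occurrence.
func framesCommonToAll(_ occurrences: [Occurrence]) -> [String] {
    guard let first = occurrences.first else { return [] }
    var common = Set(first.joined())
    for occurrence in occurrences.dropFirst() {
        common.formIntersection(occurrence.joined())
    }
    return common.sorted()
}

// Common frames of this crash that do not appear in occurrences of unrelated issues.
func suspiciousFrames(in occurrences: [Occurrence], excluding unrelated: [Occurrence]) -> [String] {
    let baseline = Set(unrelated.flatMap { $0.joined() })
    return framesCommonToAll(occurrences).filter { !baseline.contains($0) }
}
```

<p>Whatever survives the filter is not automatically the culprit, but it's a much shorter list of places to start reading code.</p>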
<h3>Match the environment of the crashing users</h3>
<p>If after deeply analyzing the trace and metadata you still can't figure out what's going on, it may be the case that the crash is tied to a specific device and/or AB test. If that's the case, then you should be able to reproduce the crash by matching the user's environment. Besides making sure to use the exact phone/OS version that the user experienced the crash on, make sure that you're also matching the user's AB testing flags (if your app has them).</p>
<p>Regarding flags, one very useful thing to do is to compare the flags of a list of users with the issue against those of a list of users <b>without</b> the issue. If the issue is connected to a flag, then compiling a list of common flags in these groups will reveal which flag (or lack thereof, if the issue was caused by removing an experiment) is causing the problem.</p>
<p>EDIT: <a href="https://twitter.com/daveverwer">Dave Verwer</a> also mentioned something important that I forgot to add -- make sure to also run the exact build of the app that the users are crashing on! It's not unlikely for the changes on your branch to affect the conditions for the crash, so always make sure you're on the exact commit the build was archived on. You can gain this ability by making your CI create a git tag every time it uploads a new build -- by naming the tag with the corresponding build number, you'll have the power to roll back to any release you've ever created.</p>
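<p>The flag comparison can be sketched as a couple of set operations. The example below assumes you can export, for each user, the set of flags that were enabled for them. The FlagsByUser shape, the FlagComparison type and the flag names used with it are all hypothetical.</p>

```swift
// Maps a user identifier to the AB testing flags enabled for that user (hypothetical shape).
typealias FlagsByUser = [String: Set<String>]

struct FlagComparison {
    // Enabled for every crashing user and for no healthy user.
    let onlyInCrashingUsers: [String]
    // Enabled for every healthy user and for no crashing user.
    let missingInCrashingUsers: [String]
}

func compareFlags(crashing: FlagsByUser, healthy: FlagsByUser) -> FlagComparison {
    // Flags that every user in the group has enabled.
    func sharedByAll(_ users: FlagsByUser) -> Set<String> {
        var iterator = users.values.makeIterator()
        guard var shared = iterator.next() else { return [] }
        while let flags = iterator.next() {
            shared.formIntersection(flags)
        }
        return shared
    }

    // Flags that at least one user in the group has enabled.
    func enabledForAny(_ users: FlagsByUser) -> Set<String> {
        users.values.reduce(into: Set<String>()) { $0.formUnion($1) }
    }

    return FlagComparison(
        onlyInCrashingUsers: sharedByAll(crashing).subtracting(enabledForAny(healthy)).sorted(),
        missingInCrashingUsers: sharedByAll(healthy).subtracting(enabledForAny(crashing)).sorted()
    )
}
```

<p>The second list covers the "removed experiment" case: a flag that every healthy user still has, but none of the crashing users do.</p>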
<h3>Retrace the user's steps</h3>
<p>If everything proved to be useless, you should be able to reproduce the issue by mimicking what the users are doing before the crash happens. This can sometimes be something absurdly specific like opening/closing the app a couple of times, turning it upside-down, opening a playlist and then throwing the device against a wall, and if that's the case, then you should be able to find these steps by looking at your analytics SDK's timeline for that user.</p>
<p>It's important to note that it's possible that the conditions to trigger the issue can span <b>multiple</b> sessions, like an issue that involves content that was downloaded a couple of days ago. In cases like that, understanding the issue requires looking not only at the data of the session where the crash happened, but also of the sessions that came before it.</p>
<h3>Instrument for thread / memory safety issues</h3>
<p>If you <b>still</b> can't figure out what's happening, then you may be dealing with a non-deterministic issue caused by either thread safety issues such as race conditions or memory issues like heap corruption. There's unfortunately no easy way of figuring these out, and you'll need to have a deep understanding of the code to catch them. In iOS, the Zombies instrument and Xcode's Thread Sanitizer and Address Sanitizer can be of some help.</p>
<p>The best thing you can do here is to prevent these from happening in the first place. If you're working with asynchronous code, always be 100% sure that your implementation is thread-safe for all its usage scenarios before merging it. While thread-related issues are very easy to introduce, they're extremely hard to debug. Choose to always be on the safer side to avoid issues like this in the future.</p>
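<p>As an illustration of what "being on the safer side" can mean, here is a minimal sketch of shared mutable state guarded by a lock. This is just one of several ways to do it (a serial queue or an actor would work too), and the ThreadSafeCounter type is made up for this example.</p>

```swift
import Foundation

// Shared state that is only read or written while holding the lock.
// Marked @unchecked Sendable because the lock, not the compiler, guarantees safety here.
final class ThreadSafeCounter: @unchecked Sendable {
    private let lock = NSLock()
    private var counts: [String: Int] = [:]

    func increment(_ key: String) {
        lock.lock()
        defer { lock.unlock() }
        counts[key, default: 0] += 1
    }

    func value(for key: String) -> Int {
        lock.lock()
        defer { lock.unlock() }
        return counts[key] ?? 0
    }
}

// Hammer the counter from many threads at once.
let counter = ThreadSafeCounter()
DispatchQueue.concurrentPerform(iterations: 1_000) { _ in
    counter.increment("taps")
}
```

<p>Without the lock, the concurrent writes to the dictionary would be a data race, which is exactly the kind of thing that shows up as a crash with a trace that makes no sense.</p>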
<h3>Check what was introduced in the release where the crash started</h3>
<p>In some cases, especially very old issues, it can be helpful to track down the exact version in which the issue started happening and hop into GitHub to see what exactly was introduced in that release. If nothing else worked, then reverting suspicious pull requests could do the trick.</p>
<h3>Add more logs</h3>
<p>In the event that you have absolutely no clue what's happening, adding more logs could provide some relief. Firebase allows you to attach generic logs to a crash report, and one way you can use them is to log information about the state of the user's app around the place the crash happens. Try to think of anything weird or unintentional that can happen around the code that is crashing and log it to Firebase -- in the next release, you'll be able to see these logs alongside the crashes. They can also be useful even before you push a new feature; if you think a new feature could cause issues, you can already safeguard it with logs before it's even released. In the event that it does cause an issue, you'll already have the additional information you need to debug it.</p>
<h3>What else?</h3>
<p>If you've reached this far, then it's possible that your initial thoughts were true: you're dealing with some bizarre hardware problem caused by the sun's radiation at a specific time of the day for a specific user in Latvia.</p>
<div class="sponsor-article-ad-auto hidden"></div>
<p>To avoid situations like this, I personally try to completely avoid looking into issues until they are consistently happening for a sufficiently large number of users. It's always possible for issues to be caused by situational things, but unless they are consistent or of high impact, it's probably best to ignore them to avoid the possibility of wasting your time looking into something that turns out to not be your fault.</p>
</div>
</div>
<div class="blog-post footer-main">
<div class="footer-logos">
<a href="https://swiftrocks.com/rss.xml"><i class="fa fa-rss"></i></a>
<a href="https://twitter.com/rockbruno_"><i class="fa fa-twitter"></i></a>
<a href="https://github.com/rockbruno"><i class="fa fa-github"></i></a>
</div>
<div class="footer-text">
© 2025 Bruno Rocha
</div>
<div class="footer-text">
<p><a href="https://swiftrocks.com">Home</a> / <a href="blog">See all posts</a></p>
</div>
</div>
</div>
</div>
</div>
</div>
</div>
<!-- Blog Post (Right Sidebar) End -->
</div>
</div>
</div>
<!-- All Javascript Plugins -->
<script type="text/javascript" src="js/jquery.min.js"></script>
<script src="https://cdn.jsdelivr.net/npm/bootstrap@5.0.2/dist/js/bootstrap.bundle.min.js" integrity="sha384-MrcW6ZMFYlzcLA8Nl+NtUVF0sA7MsXsP1UyJoMp4YLEuNSfAP+JcXn/tWtIaxVXM" crossorigin="anonymous"></script>
<script type="text/javascript" src="js/prism4.js"></script>
<!-- Main Javascript File -->
<script type="text/javascript" src="js/scripts30.js"></script>
<!-- Google tag (gtag.js) -->
<script async src="https://www.googletagmanager.com/gtag/js?id=G-H8KZTWSQ1R"></script>
<script>
window.dataLayer = window.dataLayer || [];
function gtag(){dataLayer.push(arguments);}
gtag('js', new Date());
gtag('config', 'G-H8KZTWSQ1R');
</script>
</body>
</html>