Flutter mobile app code audit

Categories: Development

Daria Orlova

Oct 07, 2022

•

11 min

What is a code audit?

According to Wikipedia:

A software code audit is a comprehensive analysis of source code in a programming project with the intent of discovering bugs, security breaches, or violations of programming conventions.

A code audit can be:

Performed as a service of its own, in order to determine the current state of the app and a roadmap for future development based on the findings.
The first step when inheriting the product and planning its future work.

It is important to note that the roadmap depends a lot on the goal of the client requesting the code audit. Even though a lot of the time it may seem like just throwing away the codebase and starting from scratch is a great idea, in reality, a more pragmatic approach is often a more productive one 😉

Client goals for code audit

As I said, a lot depends on the goals of the client:

If the app is in a tolerable state, and the goal is to just launch the MVP (Minimum Viable Product) ASAP to get first user feedback (or please the investors 😅) — it makes sense to prioritize tasks related to security and release, leaving performance issues for the next iteration.
If time & money are not an obstacle, and the goal is to release an amazing product, then it might be more productive to start from scratch. Again, depending on the app state, because in this case “tolerable” may not be enough, and starting from scratch may be faster and better than fixing all of the issues.
If it is already an existing product and an immediate full rewrite is not justified from the product perspective, it is very important to plan the roadmap carefully for a healthy combination of new feature development & fixing of the existing issues.

There are trade-offs for every goal & it’s crucial to understand them so that the results can meet expectations.

What does a technical code audit consist of?

Now, it’s important to set expectations here. What I will be describing is the technical code audit that can be performed by one developer (given the size of the codebase is sane for one person to handle). It does not intend to cover things like full functionality testing or a thorough system security audit, because these are services of their own. The points mentioned here are based on auditing a Flutter app codebase, but most of them are easily transferable to native apps too.

Code quality & maintainability

Source control tools

It all starts with the project's location. Is it checked into a source control system? If it is, is the commit history readable? If the answer to any of these questions is “no”, then it will be harder to answer the “why’s” found in the codebase.

README

When you open a project, the first thing that you see is the README file. And in my opinion, this file should be enough to understand how to build the project: what configs to specify as build parameters, how to generate any code if you use code generation in the project, and so on. If the README is empty, it will be harder for me to figure out how to compile the project and build it for the correct environment.

Flavors & environments

This leads me to check whether the app has any flavors set up. If it’s already in production, I’d assume it has at least a staging environment; otherwise, it may be tricky to test it further. If it’s still under development, then flavoring is less critical, yet it is still a task that will go onto the list of improvements. But even if there are no flavors at the moment, I check how easy it would be to add them and how much of the “flavoring” is baked into the codebase as opposed to extracted into a config.
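
As a minimal sketch of what “extracted into a config” can look like, assuming the values are passed at build time via `--dart-define`: the `AppConfig` class, the `FLAVOR` and `API_BASE_URL` keys, and the default URL are hypothetical names chosen for illustration, not something a given project will necessarily use.

```dart
/// Hypothetical environment config, resolved at compile time from
/// `--dart-define` values instead of being hardcoded across the codebase.
class AppConfig {
  const AppConfig({required this.flavor, required this.apiBaseUrl});

  final String flavor;
  final String apiBaseUrl;

  /// Reads values passed as, for example:
  /// `flutter run --dart-define=FLAVOR=staging --dart-define=API_BASE_URL=...`
  /// The key names and the default URL are illustrative assumptions.
  static const fromEnvironment = AppConfig(
    flavor: String.fromEnvironment('FLAVOR', defaultValue: 'dev'),
    apiBaseUrl: String.fromEnvironment(
      'API_BASE_URL',
      defaultValue: 'https://dev.example.com',
    ),
  );

  bool get isProduction => flavor == 'prod';
}
```

With something like this in place, adding a new environment means passing different build parameters rather than hunting for URLs in the code.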

Code formatting

Code formatting is important. It is like grammar & syntax: people agree on it in order to understand each other consistently. The messier the code, the harder it is to navigate. Also, if the developer doesn’t format their code, do they even care about it at all? 😅 And what awaits in the code itself?…

Project structure, architecture & separation of concerns

There are many approaches to architecting your app. And in general, I think that there is no one size that fits all. So when I evaluate the app structure, these are some of the questions that I answer:

How easy is it to navigate it? How intuitive are the directories? Do they actually contain what they imply & ONLY contain what they imply?
If I had to start working on a feature right away, do I know where I would put the widgets? The API calls? The business logic?
Are these concerns clearly separated, or all mixed in together? Do UI changes require touching business logic?
How many extra changes would I need to do if I changed an API response model? Is it encapsulated, or do I need to make tweaks in several places?
If I had to fix a bug related to feature X, is it easy to define the scope of feature X?
Are there a lot of generic utils, helpers, etc.?
Is there a clear source of truth for the state? A widget, a bloc, a provider, a controller, a repository? Or is it all over the place?
Is there a global app state, how is it shared & maintained?
If there is an architecture pattern implemented, is it actually implemented correctly? If I continue working on this project, will I be able to follow this pattern consistently?

Data models

Data models are an integral part of the application, be it for modeling API responses, database entities, or any other business logic models. Hence for productive work with them, there are a few things I care about:

Equality implementation. Objects in Dart are equal by default only if they’re literally the same instance, which is a property not suitable for data models. I worked on a project where object equality was nonexistent and all comparing was done just by the id fields. It was a nightmare to work with and to write sane tests with this approach. This also goes hand in hand with data model immutability. So if I don’t see an equatable or freezed dependency, I get alarmed 😂
Serialization. Are network request/response models strongly typed? Is JSON serialization encapsulated into toJson / fromJson or smeared all over the call sites? Are any codegen-based tools that reduce potential human errors used? Are these models used as-is or mapped to domain models?
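
To make the two points above concrete, here is a hand-written sketch of an immutable model with value equality and encapsulated JSON handling. The `User` class and its fields are hypothetical; in a real project a package such as equatable or freezed would typically generate or shorten most of this.

```dart
/// Hypothetical API model: immutable, value-comparable, with JSON
/// parsing kept in one place.
class User {
  const User({required this.id, required this.name});

  /// Throws a [TypeError] if a field is missing or has the wrong type.
  factory User.fromJson(Map<String, dynamic> json) {
    return User(id: json['id'] as int, name: json['name'] as String);
  }

  final int id;
  final String name;

  Map<String, dynamic> toJson() => {'id': id, 'name': name};

  /// Returns a modified copy instead of mutating this instance.
  User copyWith({int? id, String? name}) =>
      User(id: id ?? this.id, name: name ?? this.name);

  @override
  bool operator ==(Object other) =>
      identical(this, other) ||
      other is User && other.id == id && other.name == name;

  @override
  int get hashCode => Object.hash(id, name);

  @override
  String toString() => 'User(id: $id, name: $name)';
}
```

With this, two `User` objects holding the same data compare as equal, which is what makes assertions in tests and comparisons in state management behave sanely.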

App testability & refactoring

Refactoring is an essential part of development, and its necessity is tightly coupled with codebase quality. Ironically, the worse the quality, the harder it is to refactor.

The ultimate enabler of safe refactoring is automated tests. Do they even exist? What parts of the app do they cover? Are they useful? Are there any failing tests? These are all nice questions I have yet to encounter because in my experience this list ends with a “no” from the very first question 🥲😂
The next thing I look at is dependency injection. If it exists, then there is at least a chance of adding tests without refactoring everything beforehand.
Localization. There are a couple of things that can go wrong:
- Even if there is only one language that the app supports (at the moment! what about the future?), the localizations should never be hardcoded and spread all over the app. The bigger the app, the more of a disaster it is to collect & extract all of these strings into a single, coordinated place. Using a localization tool from the start is a decision you won’t regret. Even if the language stays the same forever, it is still easier to manage, change & reuse the strings if they’re organized in a cohesive manner.
- A slightly better, but still not ideal, approach is hardcoded keys. These are fragile & prone to human error, as well as not safely reusable, so if that’s the chosen pattern, it should also be considered for refactoring.
Other visual resources. The same goes for accessing static app resources such as images, icons, etc.
The environment config: the Flutter & Dart versions against which the codebase has been built, the major library versions that it uses, and whether it is sound null safe. The less up to date it is, the bigger the possibility of breaking changes during migration. Splatter it on top of 0% tests and you get yourself a recipe for a refactoring disaster.
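
For the dependency injection point, here is a minimal sketch of why it matters for testability, using plain constructor injection. All names here (`UserRepository`, `GreetingService`, `FakeUserRepository`) are hypothetical and only illustrate the idea; a real codebase may use a DI package instead.

```dart
/// Hypothetical abstraction over wherever the data comes from.
abstract class UserRepository {
  Future<String> fetchUserName(int id);
}

/// Business logic receives its dependency through the constructor
/// instead of creating an HTTP client or database handle itself.
class GreetingService {
  GreetingService(this._repository);

  final UserRepository _repository;

  Future<String> greet(int userId) async {
    final name = await _repository.fetchUserName(userId);
    return 'Hello, $name!';
  }
}

/// In-memory fake for tests: no network or plugins involved.
class FakeUserRepository implements UserRepository {
  @override
  Future<String> fetchUserName(int id) async => 'User $id';
}
```

A test can then construct `GreetingService(FakeUserRepository())` and check the result of `greet` without touching the network, which is not possible when the service builds its own dependencies internally.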

Dependencies

One common mistake that I see more often than I would like is listing packages under dependencies that belong in dev dependencies, such as code generators & testing-related packages. According to the docs:

Using dev dependencies makes dependency graphs smaller. That makes pub run faster and makes it easier to find a set of package versions that satisfies all constraints.

Open source software is amazing, but not all publicly available software, including libraries, is free. It’s important to check the license of the package you’re depending on because it might have conditions that you don’t comply with.
“The best dependency is no dependency” is a great rule of thumb, yet not always feasible. Or rather, it is feasible, but then your product will take much more time to develop, and that is rarely the goal. Yet not all libraries are created equal, and depending on one makes it part of your codebase. And you should be mindful of what you put there. In any form. So before committing to a library, take a look at these things:
- The source code, obviously. Does it live up to your standards? Does it solve what it promises?
- Is the library popular among developers?
- How well is it managed? Do issues & pull requests get resolved regularly? Is it actively maintained and releasing updates?
- Is there good documentation? Examples?
- Are there many transitive dependencies? What are they?
- If you’re using GitHub, then the good news is that it now offers supply chain security support for Dart & Flutter apps, and you can learn about package security issues with those tools.

Persisted storage

Persisted storage comes in different flavors: there are simple key-value data, then there are databases (SQL & NoSQL), and there is also work with the file system.

It is important to understand:

What kind of data is stored?
Why is it stored & does it actually need to be stored on the device?
Is it stored in the correct format/place?
How often can its schema change?
How are the migrations handled?
Can the app recover from data corruption?
How are queries formed? Are they performant? Are there any N+1 problems?
If files were required for temporary use, are they cleaned up correctly? If they’re required for persistent storage, are they saved in the correct place?

I have seen cases where a local database was supposedly used for caching, but it was only writing data and never reading it, hence doing useless work. Cases where models whose schema can change often were saved in plain text format, making it impossible to do sane migrations without falling back to destructive ones. And cases where unhandled recovery from data corruption ended up crashing the app. Data persistence on devices is a huge topic that involves general CS knowledge and careful planning when used to a big extent, so it’s important to pay attention to it.
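
As a rough sketch of what non-destructive migrations can look like, assuming the stored data carries a schema version: the field names, version numbers, and the `migrate` helper below are all hypothetical, and a real database package will usually have its own upgrade hooks to plug such steps into.

```dart
/// One step that upgrades stored data by exactly one schema version.
typedef Migration = Map<String, Object?> Function(Map<String, Object?> data);

/// Hypothetical migration table keyed by the version being upgraded from.
final Map<int, Migration> migrations = {
  // v1 -> v2: add a nullable field.
  1: (data) => {...data, 'nickname': null},
  // v2 -> v3: rename 'name' to 'fullName'.
  2: (data) {
    final next = {...data};
    next['fullName'] = next.remove('name');
    return next;
  },
};

const currentSchemaVersion = 3;

/// Applies every missing step in order, so stored data is upgraded
/// rather than dropped.
Map<String, Object?> migrate(Map<String, Object?> data, int storedVersion) {
  var result = data;
  for (var v = storedVersion; v < currentSchemaVersion; v++) {
    final step = migrations[v];
    if (step == null) {
      throw StateError('No migration from schema version $v');
    }
    result = step(result);
  }
  return result;
}
```

The point of the design is that each step is small and ordered, so a user who skipped several app versions still ends up with valid data.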

General code quality & maintainability

Does the project in general follow the official guidelines & recommendations, as well as community-approved approaches and patterns? If not, are the “deviations” justified (e.g. a custom state management solution that is used consistently and is adapted to the app requirements)?
Dart has an amazing feature of sound null safety. The only bad thing about it is that it was introduced after Flutter apps were already in full-speed development. It’s crucial to migrate the codebase to null safety, and specifically to sound null safety. If that is not the case for the codebase under audit, this would be on the list of top priorities. The initial migration can be painful, take time & break things, but it is the necessary pain required for further development to be productive.
Static code analysis. Are there any issues reported by the analyzer tools? Ideally, there should be 0 error & warning level issues reported. Are there any warnings explicitly ignored? Is an explanation for an ignored warning provided via comment?
Unused code. It just bloats the codebase and makes it harder to navigate it and understand what is relevant and what is not. To find unused code & files you can use automated tools, such as Dart Code Metrics. If you don’t need the code - delete it. If you think you might need it in the future, you can always come back to it via source control history. If you don’t have source control enabled, then I’m not sure what to say.
Duplicated code. I always have questions when I see exactly the same code for the 3rd time in the codebase, doing exactly the same logical thing and needing to be updated in all of these places in case of changes… I feel like writing exactly the same thing for the Nth time is a great cue that this code can be extracted & reused, but I learned this is not always the case. So it makes sense to run tools on the codebase that find the exact amount of duplicated lines.
And various other basic things like how readable the code is in general, bloated methods, classes, files, etc., magic numbers, reflection, used design & architecture patterns, code comments, and so on.

Design system

A design system is a collection of reusable components, usually created by designers in a dedicated tool, for example, Figma. And even if there is no design system specified explicitly, I’m pretty sure that there are still common colors, text styles, and basic elements such as buttons, text fields & checkboxes reused across the screen designs. I understand that this is not always something that a developer can influence, yet what they can do is control how they implement designs in the app. So what I take note of is this:

Are basic design components like color palette & typography used consistently across the codebase, via Theme or custom approaches, or just scattered around and hardcoded? How hard would it be to add another theme to the app?
Are widgets reused? If the main button color changed, would I need to apply changes in all of the places where the button is present or just in the base component?
Are bigger widgets split into components that are easy to navigate, or are they one big blob with a scary level of nesting?

Performance & production readiness

The next perspective from which I review the codebase & the product itself is the general performance of the app and how ready it is for production.

UI performance

How is the stateful widget lifecycle handled? Is it respected, e.g. are there any calls to access the state when it is no longer mounted?
Does it dispose of the resources that it used correctly? Are there any potential memory leaks?
Does it trigger unnecessary rebuilds, either in general or of parts of the tree that haven’t changed?
Are there any frame drops due to too much work during the build phase? Is there anything going on, like calculations or network requests in the build methods?
Is there any jank in general and what is its cause? How smooth is the overall app experience?
Are the visual resources, such as PNGs, optimized for the device specifications? Are network images cached, are there any loading & error placeholders?
How are potentially recyclable components, such as ListView, handled? Are they loaded lazily? Or, for example, do they use shrinkWrap or lay out everything eagerly in a Column?
What is the state of the responsiveness? How does it look on various device sizes, OS versions & other specs?
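
A few of the checks above can be shown in one small widget. This is only a sketch: the `SearchField` widget and its `search` callback are hypothetical, and it assumes a Flutter project on a Dart version that supports `super.key`.

```dart
import 'package:flutter/material.dart';

class SearchField extends StatefulWidget {
  const SearchField({super.key, required this.search});

  /// Hypothetical async search callback supplied by the caller.
  final Future<List<String>> Function(String query) search;

  @override
  State<SearchField> createState() => _SearchFieldState();
}

class _SearchFieldState extends State<SearchField> {
  final _controller = TextEditingController();
  List<String> _results = const [];

  Future<void> _onSubmitted(String query) async {
    final results = await widget.search(query);
    // The widget may have been removed while the request was in flight.
    if (!mounted) return;
    setState(() => _results = results);
  }

  @override
  void dispose() {
    // Release the controller's resources together with the widget.
    _controller.dispose();
    super.dispose();
  }

  @override
  Widget build(BuildContext context) {
    // No network calls or heavy calculations here: build only describes UI.
    return Column(
      children: [
        TextField(controller: _controller, onSubmitted: _onSubmitted),
        Expanded(
          // Builds rows lazily as they scroll into view.
          child: ListView.builder(
            itemCount: _results.length,
            itemBuilder: (context, index) =>
                ListTile(title: Text(_results[index])),
          ),
        ),
      ],
    );
  }
}
```

The things I would look for in a real codebase are the same three: a `mounted` check after every `await` that leads to `setState`, a `dispose` that releases what the state created, and lazy builders for long lists.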

User experience & journey

How is error handling & reporting done? Will you find out if your app has crashed in production, or if something is not working as expected? Are the exceptions swallowed, or reported via tools like Crashlytics / Sentry?
Are analytics enabled? Are the events logged consistently & meaningfully? Is any sensitive information logged that should be specified in the data consent?
In general, how obvious is it to the user what state the app is in (loading, empty, content, error), and do they know what they can do about it?

I/O operations

Even if the UI is running smoothly, the app can still be generally slow, and that’s usually related to problems in I/O operations, so the check includes:

If there is a local database involved, then analysis of its structure, indexes & queries.
Network requests handling and the response payload. This is kind of moving to the API territory, but it’s important to determine where the problems are coming from. Is pagination implemented where applicable? Are responses returning more information than necessary, hence bloating the models and making the data fetch longer than it could be? Are any tools, such as Firebase Performance, used to monitor network communication health?
Are there any independent long-running operations awaited one after another when they could run concurrently? Are the Future & Stream APIs used efficiently?
Is there any multithreading via Isolates involved? Is it implemented correctly and is it justified and beneficial?
Analysis of background work implementation if applicable
Analysis of hardware API communications
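
To illustrate the Future and Isolate questions, here is a small sketch with simulated work. The function names and delays are made up for the example, and `Isolate.run` assumes Dart 2.19 or newer on a non-web platform.

```dart
import 'dart:isolate';

/// Hypothetical independent requests, simulated with delays.
Future<String> fetchProfile() =>
    Future.delayed(const Duration(milliseconds: 50), () => 'profile');
Future<String> fetchSettings() =>
    Future.delayed(const Duration(milliseconds: 50), () => 'settings');

/// Awaits one after the other: total time is the sum of both.
Future<List<String>> loadSequentially() async {
  final profile = await fetchProfile();
  final settings = await fetchSettings();
  return [profile, settings];
}

/// Starts both first, then waits: total time is roughly the slower one.
Future<List<String>> loadConcurrently() =>
    Future.wait([fetchProfile(), fetchSettings()]);

/// CPU-heavy work that would block the UI if run on the main isolate.
int sumTo(int n) {
  var total = 0;
  for (var i = 1; i <= n; i++) {
    total += i;
  }
  return total;
}

/// Runs the calculation on a short-lived background isolate.
Future<int> sumInBackground(int n) => Isolate.run(() => sumTo(n));
```

Both loading functions return the same data; the difference is only in how long the user waits. Spawning an isolate has its own overhead, so it is worth it for genuinely heavy work such as parsing a large payload, not for trivial calculations like the one used here for brevity.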

Other

What are the supported OS versions?
What OS permissions are requested? Are there any that could get you in trouble with the app stores? Are there any that are not justified?
Analysis of app config files such as AndroidManifest & Info.plist
Are there any paid services used, such as Firebase or Google Places API? Is the usage cost-effective?
What is the app size, is there anything increasing it unnecessarily?
Is CI/CD set up? What workflows exist?

Security

It’s important to note here that a full security audit of the system is a separate service of its own and is not covered in this section. From the security standpoint, I analyze things present in the codebase. For a deeper dive into mobile app security standards, I can recommend checking out the OWASP Mobile Application Security Verification Standard (MASVS).

How are Android release signing certificates handled? Are they stored in-app, checked into source control? What about the credentials?
What approaches are used for user authentication & authorization? If it’s session or token-based, does it conform to security standards?
How is sensitive user data handled? Is any sensitive data such as auth tokens, passwords, or credit card data stored in insecure storage, such as shared preferences? Is it logged to the console?
How are sensitive API keys stored? Who has access to them?
How is network traffic handled? Is HTTP allowed? Are there any trust mechanisms implemented, such as certificate pinning?
If there is SQL involved, how are the queries formed? Are they subject to injection attacks?
Is the code obfuscated in release mode?
Analysis of WebView usage, such as enabling JavaScript and providing access to mobile data.
Analysis of the general infrastructure surrounding the app: who has access to source code, what 3rd party SDKs are used, and who has access to their respective settings, etc.
Analysis of the usage of 3rd party SDKs that operate with sensitive user data, such as payment SDKs.
Analysis of paid app content access.
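
For the SQL injection question, the difference I look for is whether user input ends up inside the SQL text or travels separately as an argument. The sketch below uses a hypothetical `Query` holder and a made-up `users` table to show it in plain Dart; database packages such as sqflite accept the SQL string and the argument list separately in a similar way.

```dart
/// A SQL statement plus the values bound to its `?` placeholders.
class Query {
  const Query(this.sql, this.args);

  final String sql;
  final List<Object?> args;
}

/// Unsafe: user input becomes part of the SQL text itself.
String unsafeFindUser(String name) =>
    "SELECT * FROM users WHERE name = '$name'";

/// Safer: the SQL text is fixed and the input travels separately, so the
/// database can treat it purely as a value.
Query findUser(String name) =>
    Query('SELECT * FROM users WHERE name = ?', [name]);
```

With an input like `x' OR '1'='1`, the first version produces a query whose condition is always true, while the second keeps the statement unchanged no matter what the user types.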

Results of the code audit

The result of the audit is a document with all of the findings in the listed domains, as well as prioritized short & long-term action plans aligned with the client's goal.

I hope you found this article helpful and that you can use this plan to audit your own app. Or request one from Chili - then there is a high chance that if it’s a Flutter app, I’ll be the one doing its audit 😜

Share feedback and talk to me on Twitter @dariadroid! 🐦

— Daria 💙
