SonarQube 9.9 LTS sports a powerful Python analyzer, with 250 (okay, 249) rules for making sure that Python developers can write Clean Code that is fit for production and fit for development.
In this LTS release, there are significant advancements in Python analysis compared to SonarQube 8.9 LTS. Grab a coffee and get comfortable as I walk you through these improvements!
Using SonarCloud? You'll find all these improvements there as well.
To provide accurate analysis, SonarQube relies on type information for the Python standard library as well as common libraries used by Python developers. This type information is provided by Typeshed (a collection of Python stubs).
In SonarQube 8.9 LTS, this information was calculated at analysis time, which was expensive. It also wasn’t possible to collect all the information available, such as conditional type information based on the version of Python being used.
SonarQube 9.9 LTS extracts far more data from Typeshed, and does so more efficiently: symbols are calculated once and shipped with SonarQube rather than recomputed on each analysis. The result is an analysis that is both faster and more accurate.
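To give a feel for what "conditional type information" looks like, here is a simplified, hypothetical stub-style declaration. It is not copied from Typeshed, and `filter_like` is an invented name; it only assumes the `sys.version_info` check that stub files commonly use to describe version-dependent signatures.

```python
import sys
from typing import Callable, Iterable, Iterator, List, TypeVar, get_type_hints

T = TypeVar("T")

# Stub-style declarations: the body is "..." because only the signature matters.
# filter_like is a hypothetical function used purely for illustration.
if sys.version_info >= (3, 0):
    def filter_like(function: Callable[[T], bool], iterable: Iterable[T]) -> Iterator[T]: ...
else:
    def filter_like(function: Callable[[T], bool], iterable: Iterable[T]) -> List[T]: ...

# A tool that knows the target Python version can select the matching branch
# and read the declared return type from it.
return_type = get_type_hints(filter_like)["return"]
```

On a Python 3 interpreter the first branch is the one that gets defined, so the declared return type is an iterator rather than a list.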
As just mentioned, SonarQube can now take into consideration type information specific to the version of Python being used.
Python 3 has many breaking changes compared to Python 2, which has an impact on our bug detection rules when some code pattern is a bug in Python 3 but not in Python 2!
Developers using SonarQube 9.9 LTS can now set the sonar.python.version analysis parameter in order to detect issues specific to Python 2 or Python 3.
Consider this piece of code:
There's a problem here if you're using Python 3: The filter API returns an iterator that does not have a __getitem__ method. This isn't a problem with Python 2, because the same API returns a list. This is an easy mistake to make if you're migrating your codebase to Python 3.
SonarQube 9.9 LTS, knowing the version of Python being used, can properly raise an issue on this code.
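As a minimal sketch of the pattern described above (the function names are hypothetical and chosen only for illustration), indexing the result of `filter` works on Python 2 but fails on Python 3:

```python
def first_even_buggy(values):
    evens = filter(lambda v: v % 2 == 0, values)
    # Python 2: filter() returns a list, so indexing works.
    # Python 3: filter() returns an iterator without __getitem__,
    # so the next line raises TypeError.
    return evens[0]


def first_even_fixed(values):
    # Materializing the iterator into a list makes indexing valid
    # on both Python 2 and Python 3.
    evens = list(filter(lambda v: v % 2 == 0, values))
    return evens[0]
```

Wrapping the call in `list(...)` is one way to keep the code valid on both versions; using `next(...)` on the iterator would be another.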
Speaking of Python versions… a new SonarQube LTS means support for new versions of a language, which requires SonarQube to update how code is parsed and understood in the context of raising issues.
This also means that existing rules have been updated to not raise false-positives on these constructs either.
SonarSource acquired RIPS Technologies all the way back in May 2020 (there were a few other things happening in the world in Spring 2020, so don't worry if you forgot)!
Not only did we gain many great colleagues, but we also acquired their advanced technology for detecting vulnerabilities in Python. After months of work, we took the best of the Sonar & RIPS engines to produce a new security engine for Python. We actually replaced the engine entirely, moving from so-called fixed point analysis to symbolic analysis.
This means the security engine for Python is now field-sensitive, and commercial editions of SonarQube 9.9 LTS can precisely track which field of an object is tainted (or not) by malicious user input. For you, this means fewer false-positives so you can concentrate on fixing real vulnerabilities, not analyzing the fake ones.
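To picture what field sensitivity means, consider this small sketch. The class, field, and function names are hypothetical, and the comments describe the general idea rather than the exact output of any SonarQube rule. Only one attribute of the object carries user input, and an engine that tracks taint per field can tell the two attributes apart, while one that tracks taint per object would have to treat both as tainted.

```python
import sqlite3
from dataclasses import dataclass


@dataclass
class SearchRequest:
    table: str  # set by the application, never from user input
    term: str   # copied from user input, so this field is tainted


def build_request(user_input: str) -> SearchRequest:
    # Only the "term" field receives user-controlled data.
    return SearchRequest(table="books", term=user_input)


def run_search(conn: sqlite3.Connection, request: SearchRequest):
    # request.table is an application constant, so interpolating it into the
    # SQL text does not let user input reach the query string.
    query = f"SELECT title FROM {request.table} WHERE title = ?"
    # request.term is tainted, so it is passed as a bound parameter.
    return conn.execute(query, (request.term,)).fetchall()
```

Without per-field tracking, the query built from `request.table` could look suspicious simply because it comes from an object that also holds user input.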
On the topic of false-positives, it wasn’t only security rules that saw improvements. Sonar puts in a significant amount of effort to make sure only true issues are raised, and our developers are always reviewing issues raised by Python rules to make sure they are accurate and relevant. They also receive reports from our community and through commercial support channels.
Not counting all of the FPs fixed by updates to the analysis engine, there were 31 specific false-positives our developers addressed in SonarQube 9.9 LTS!
Sometimes it's easy to get so focused on the impressive new rules that, stepping back, we see there are some less complex (but still important) rules that need to be implemented!
SonarQube 9.9 LTS brings nine of these rules that are commonly provided by other linters, such as tracking TODO tags and making sure copyright/license headers are included on each file.
You can find the complete list of these rules here.
Regular expressions (regex) are sequences of symbols and characters expressing a string or pattern to be searched for within a longer piece of text. Regex is an incredible tool to express conditions that would otherwise require many lines of code to catch the same pattern.
While using regex is quite typical for developers these days, that does not make it easy to handle. Writing regexes is error-prone and time-consuming, and they're difficult to document well. Once they are written, identifying errors in them can be extremely difficult.
Not only are they difficult to write, but due to their size and complexity, they are often difficult to read and understand.
Take this example:
The third capturing group in this regular expression is (watch\?v=|embed/|.+\?v=)? to account for variations in the URL format. You might not have noticed that the third alternative in this capturing group, .+\?v=, is redundant, as it's already covered in the first alternative watch\?v= and will never apply to
So this regular expression can be simplified by removing the redundant alternative group, giving us a slightly more readable:
That would have been hard for a developer to spot on their own. It's not hard at all for SonarQube.
In SonarQube 9.9 LTS, our developers introduced 21 new rules to help Python developers write efficient, error-free, safe, and less complex regular expressions! You can find all the Python rules related to regular expressions at rules.sonarsource.com.
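The idea of a redundant alternative can be shown with a much smaller pattern. This is a hypothetical example of my own, not the URL regex discussed above: one alternative is fully covered by another, so dropping it changes nothing about what the pattern matches.

```python
import re

# "error" is already matched by "err.*", so the first alternative adds nothing.
REDUNDANT = re.compile(r"^(error|err.*|warning)$")
# Same behavior with the redundant alternative removed.
SIMPLIFIED = re.compile(r"^(err.*|warning)$")

SAMPLES = ["error", "err", "errno", "warning", "warn", "info", ""]


def same_matches(samples):
    """Return True if both patterns accept and reject exactly the same samples."""
    return all(
        bool(REDUNDANT.match(s)) == bool(SIMPLIFIED.match(s)) for s in samples
    )
```

Comparing the two patterns over a handful of samples is only a sanity check, not a proof of equivalence, which is why having an analyzer reason about the pattern itself is useful.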
If you're using the pytest frameworks to write your Python unit tests, you’re in luck, because SonarQube 9.9 LTS adds rules specifically related to analyzing your test code.
More and more developers are using the AWS CDK to describe their AWS infrastructure, combining the flexibility of a programming language with the complexity of cloud infrastructure.
The CDK provides preconfigured and experience-tested default values, but the creation of patterns and structures can still lead to security misconfigurations.
SonarQube 9.9 LTS provides 19 rules to raise security hotspots on AWS CDK code written in Python, to make sure your IaC is as secure as your source code.
SonarQube 9.9 LTS adds support for detecting advanced Python bugs using symbolic execution.
The purpose of a symbolic execution engine is to visit all feasible execution paths, even across method calls, to find tricky bugs located in the source code.
Consider the following piece of code:
In this example, a variable is initialized to None in a function and its value is used in another function. Accessing an attribute of None triggers an AttributeError.
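A minimal sketch of this kind of cross-function bug, using hypothetical function names and data, could look like this:

```python
def find_user(users, name):
    user = None  # stays None when no entry matches
    for candidate in users:
        if candidate["name"] == name:
            user = candidate
    return user


def email_of(users, name):
    user = find_user(users, name)
    # If find_user() returned None, this attribute access raises AttributeError.
    return user.get("email")


def email_of_checked(users, name):
    user = find_user(users, name)
    if user is None:  # handle the "not found" path explicitly
        return None
    return user.get("email")
```

The bug is only visible when both functions are considered together: `email_of` looks fine in isolation, and so does `find_user`.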
Here's a more complex example:
There's a lot going on here:
- _n is an alias for
- Given that the fourth arg of w_obj would be None, an exception will be raised, hence there will be no return value from
- Now, the only possible return value of get_field must be something different than None
- Hence, the condition _n is None is always False, and some subsequent code is never evaluated.
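The same chain of reasoning can be sketched on a much smaller, hypothetical example. The names below are stand-ins of my own and not the code discussed above. Because the helper raises instead of returning None, the check in the caller can never succeed:

```python
def read_field(record, key):
    if record is None:
        raise ValueError("record must not be None")
    value = record.get(key)
    if value is None:
        # Missing values raise, so None is never returned from this function.
        raise KeyError(key)
    return value


def display_name(record):
    name = read_field(record, "name")
    if name is None:  # always False: read_field() never returns None
        return "anonymous"  # unreachable code
    return name.upper()
```

Following every feasible path through `read_field` is what reveals that the "anonymous" branch is dead code.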
Mistakes like this are very common and can be very difficult to work out on your own. SonarQube 9.9 LTS now raises issues in these cases, with nine total rules detecting similar complex bugs.
These rules are available in commercial editions of SonarQube.
SonarQube is made by developers, for developers. Our goal is to help all developers be able to write Clean Code.
If you haven’t tried SonarQube 9.9 LTS yet, I hope you now have even more reasons to prepare this upgrade with your team. This is a free version upgrade for all, and you can get the LTS in just a few clicks at SonarQube Downloads.
Need more help getting started? Check the following resources:
- SonarQube LTS Upgrade Checklist
- Get help upgrading using the 9.9 LTS Upgrade category of the Sonar Community
I'd like to thank fellow SonarSourcers Alexandre Gigleux and Andrea Guarino for their contributions to this blog post.