Oddities in request times for plain-text passwords in MATRIX LMS.

Written on Feb 12, 2021.

Takes 4 mins to read.

Image © XKCD

Introduction

I have recently been looking into the security of MATRIX and NEO LMS, and have so far written two white papers describing a few vulnerabilities.

During my research I noticed a button on a user’s page that shows that user’s credentials. This immediately got me thinking that passwords were being stored somewhere unencrypted. A quick visit to the product page assured me that passwords were indeed encrypted, but having worked with the software, knowing the level of security implemented, and seeing the response times of the endpoint that provides the user's credentials, I decided to look into it.

Hashing is currently the de facto standard for storing passwords. Hashed and salted (and sometimes even peppered) passwords are very difficult to attack with rainbow tables, and because a hashing algorithm is one-way, the original password cannot simply be computed back from the hash.

That being said, there are other ways of storing passwords if it is necessary to be able to display them in plain text. An encryption algorithm (the likes of RSA or AES) with a properly secured key or key pair is a good way to go about this. In this case, though, I believe there's an argument to be made that there is no need to display passwords in plain text: there's a button to reset a user's password, which generates a new password and emails it to them. Having the ability to view a user's password is clearly not only unnecessary, it's also negligently insecure.

The data

But back to the matter at hand. I suspected that the passwords weren't even encrypted because of the response times I was observing: ~300ms, most of which would be spent getting data from the database and, as the endpoint was secured, verifying authentication keys (more database requests).

My assumption was based on simple math. When I'm working with an off-site database I usually see an average query response time of about 70ms. The software has to make at least three queries: one to verify the session cookie, another to verify the username cookie and a third to get the password (or two if they are optimized, which, again, I have reason to doubt). That puts us at 210ms (or 140ms if we assume there are only two queries), which leaves 90-160ms to decrypt the password, JSONify it (admittedly not very time consuming) and return it to the user. So in the three-query case we'd have to decrypt a password in 90ms on an AWS node, which I found quite suspect.
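The budget above is easy to reproduce. The snippet below is only a sketch of that arithmetic; the function name and constants are mine, and the 300ms and 70ms figures are the rough estimates from the text, not measurements of MATRIX's internals.

```python
# Back-of-the-envelope latency budget. Every figure here is an estimate
# taken from the text above, not a measured property of the server.
OBSERVED_MS = 300  # typical response time observed for the endpoint
QUERY_MS = 70      # average round trip I usually see to an off-site database


def remaining_budget(observed_ms: int, query_ms: int, queries: int) -> int:
    """Time left for decryption and serialization once the queries are paid for."""
    return observed_ms - query_ms * queries


for queries in (3, 2):
    left = remaining_budget(OBSERVED_MS, QUERY_MS, queries)
    print(f"{queries} queries: {left} ms left")
# prints:
# 3 queries: 90 ms left
# 2 queries: 160 ms left
```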

So I resolved to design an experiment to test it. First I needed some data regarding MATRIX, so I wrote a script to measure response times and settled on a sample size of 500 data points (10 iterations of 50 requests, so as not to trip their filters or overload the server with traffic).


                # Program to time how long the endpoint takes to return a user's password

                import csv
                from datetime import timedelta

                import requests

                # Target and credentials are intentionally left blank here
                base_url = ''
                endpoint = ''
                cookies = {}  # session cookies of an authenticated user

                headers = {'user-agent': 'Mozilla/5.0 (Windows NT 10.0; rv:78.0) Gecko/20100101 Firefox/78.0'}
                timing_list = []

                # One iteration: 50 requests, stopping early on the first non-200 response
                for _ in range(50):
                    r = requests.get(base_url + endpoint, cookies=cookies, headers=headers)
                    if r.status_code != 200:
                        break
                    timing_list.append({
                        # r.elapsed is a timedelta; dividing by 1 ms converts it to milliseconds
                        'Response Time': round(r.elapsed / timedelta(milliseconds=1), 0),
                        # A missing X-Cache header is treated as a cache miss
                        'Cached': 'Miss' not in r.headers.get('X-Cache', 'Miss'),
                    })

                # Write the timings to a CSV file that Excel can open
                with open('timings.csv', 'w', newline='') as csvfile:
                    fieldnames = ['Response Time', 'Cached']
                    writer = csv.DictWriter(csvfile, dialect=csv.excel, fieldnames=fieldnames)
                    writer.writeheader()
                    writer.writerows(timing_list)

When I ran the script, processed the data in Excel and charted it, the oddities began.

I was baffled by the results; I didn't understand what was causing all the inconsistencies. It looked like someone had handed a crayon to a chimpanzee and it'd gone to town on a piece of paper.

Were the results to dip at the start, I could've reasonably concluded that the response was being cached. Say we started at 500ms, dropped down to ~300ms and stayed there: it'd be reasonable to conclude that one of the steps, be it the database requests or the actual response, was being cached. I'd also quickly conclude that a rise and dip at the end, say from ~300ms up to ~500ms and back down to ~300ms, could be the cache expiring and the request getting recached. What I saw was beyond explanation, at least to my knowledge.

With that in mind, I attempted to verify whether one of the environmental variables was distorting the results. Maybe it was my internet connection or my proxy?

I quickly built a system in Flask to test this: it grabbed an encrypted password from a PostgreSQL database, decrypted it (using Python's RSA module) and finally returned it as JSON. I put it on Heroku, using a free dyno in Europe, and ran the same script against it.

The results were, as expected, normal. There were a few peaks here and there, but I put them down to Heroku's routing system and/or my proxy.

It becomes weirder when we use a scatter plot to visualize the MATRIX data: what we see is a lack of any discernible pattern. You can see the cluster where the majority of the requests sit (the ~300ms range), but a significant number of requests deviate a fair amount from that group.
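The spread can also be put into numbers without a chart. The following is a rough sketch, not the analysis I actually ran (that was done in Excel): it reads the `timings.csv` file written by the script above and reports the median, the standard deviation and the share of requests that fall outside a band around the median. The function names and the 100ms band width are assumptions chosen purely for illustration.

```python
import csv
import statistics


def load_times(path):
    """Read the 'Response Time' column (milliseconds) from the timing script's CSV."""
    with open(path, newline='') as csvfile:
        return [float(row['Response Time']) for row in csv.DictReader(csvfile)]


def summarize_times(times, band_ms=100.0):
    """Describe how tightly the response times cluster around their median.

    band_ms is an arbitrary, assumed width: a request counts as an outlier
    when it is more than band_ms away from the median.
    """
    if not times:
        raise ValueError('no response times to summarize')
    median = statistics.median(times)
    outside = [t for t in times if abs(t - median) > band_ms]
    return {
        'count': len(times),
        'median_ms': median,
        # stdev needs at least two data points
        'stdev_ms': statistics.stdev(times) if len(times) > 1 else 0.0,
        'share_outside_band': len(outside) / len(times),
    }
```

Calling `summarize_times(load_times('timings.csv'))` on each run makes it possible to compare the MATRIX runs against the Heroku control with the same yardstick.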

Moreover, in the final 10 requests, you can see a few runs where the times are all below 200ms which, in my opinion, is insufficient time to get the values from the database, check the validity of the session (note that this takes at least one more database request), decrypt them and serve them to the user.

That being said, I cannot and do not trust the data; it is too inconsistent. Additionally, my tests did not seek to accurately emulate their production environment, so I will not be drawing any conclusion as to whether they are encrypting the passwords or not.

Testing Limitations

Please note that the following assumption was made:

The database is not on the same node as the web interface

Moreover,

Their infrastructure is not being faithfully emulated here: MATRIX uses RoR whereas my tests used Flask, which is faster
We only tested PostgreSQL as a database engine; they could be using MySQL or any other database engine, or even a Redis + PostgreSQL combination to accelerate requests
A proxy was used to make every request, which adds latency, albeit the same proxy was used for every test, essentially nullifying its effect
There has not been any mathematical adjustment for bias or errors

Additionally, please note that AES was tested, but there was no significant difference from RSA, so it has been omitted.

Response by Cypher Learning

All personal passwords (whether supplied by a user during account creation or provided by an admin during account setup) are encrypted using individual SALT values and the original passwords are never logged, emailed, or otherwise available. Auto-generated strong passwords (for example, if bulk account creation is used and passwords are omitted) are intentionally not encrypted so that they can be sent to learners via email (this is a common customer requirement). You can configure our system so that this auto-generated password must be replaced on initial login, and the password that is then entered will be stored encrypted.

- Adrian Alberto

This means that user-set passwords are indeed hashed and salted, but auto-generated ones are not. I do understand why this is the case. Still, in my opinion, storing unencrypted passwords without forcing the user to change them (the option to force a change is turned off by default) represents a security vulnerability were the database ever to be compromised. Furthermore, there is no need to store passwords unhashed: you could simply generate the password, send it to the user, hash it and then store the hash in the database, without the plain-text password ever being in storage.
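To make that last suggestion concrete, here is a minimal sketch of the generate, send, hash, store flow using only Python's standard library. It is not how MATRIX works and not a drop-in implementation: the function names, the `send_email` callback, the record layout and the PBKDF2 iteration count are all assumptions made for illustration.

```python
import hashlib
import hmac
import secrets

ITERATIONS = 600_000  # assumed work factor; tune it for the actual deployment


def generate_password(nbytes: int = 16) -> str:
    """Generate a strong random password for a new account."""
    return secrets.token_urlsafe(nbytes)


def hash_password(password: str, iterations: int = ITERATIONS) -> dict:
    """Derive a salted PBKDF2-HMAC-SHA256 digest; only this record gets stored."""
    salt = secrets.token_bytes(16)
    digest = hashlib.pbkdf2_hmac('sha256', password.encode('utf-8'), salt, iterations)
    return {'salt': salt, 'iterations': iterations, 'digest': digest}


def verify_password(password: str, record: dict) -> bool:
    """Recompute the digest with the stored salt and compare in constant time."""
    digest = hashlib.pbkdf2_hmac(
        'sha256', password.encode('utf-8'), record['salt'], record['iterations']
    )
    return hmac.compare_digest(digest, record['digest'])


def provision_account(send_email) -> dict:
    """Generate a password, hand it to the (hypothetical) mailer, keep only the hash."""
    password = generate_password()
    send_email(password)            # the plain text leaves the system here, once
    return hash_password(password)  # this record is what goes into the database
```

The plain-text password only ever exists in memory for the duration of `provision_account`. The trade-off is that it can no longer be displayed afterwards, which is exactly the point.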

Written by Mauro M.