Musician at Matfyz, computer scientist with a saxophone
Šimon Libřický, a student from the Prague Music Computing Group, has merged music with computer science to create a model for the difficulty of saxophone scores. His bachelor's thesis, which addresses a previously unsolved problem, was evaluated by the committee as being at the master's thesis level. Šimon is also a co-author of an article on this research that has been accepted to the prestigious international conference ISMIR 2025. What led him to such an achievement? “I wanted to make life easier for wind players,” he explains.
Could you briefly introduce your work? What is it specifically about?
For a musician who is trying to decide what piece to practise, it would be great to know how easy or hard a given piece is. It would be even more useful if they could have an annotated score showing them which specific musical phrases to watch out for when practising. For self-learners, it would be best if this annotation could happen automatically. Systems like this exist for the piano, violin, and guitar, but not for any woodwind instrument.
To solve this problem, I created a model for annotating musical scores based on difficulty for the tenor saxophone. This model can create a visualisation of difficulty, as seen in the picture (the textual annotation is not generated by the model; it simply serves as an explanation for non-saxophonists as to why these specific phrases are difficult).
The model works on the principle of finding an optimal path in a weighted graph, where edge weights are dictated by the maximum trill speed between two fingerings.
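The core idea can be sketched as a standard shortest-path search over fingerings. Everything below is an illustrative toy: the note names, graph layout, and trill speeds are invented for the example, not taken from the thesis's measured data. The only assumption carried over from the description above is that slower maximum trill speed between two fingerings means a harder transition, hence a higher edge weight.

```python
# Toy sketch: fingerings as nodes, transition cost = 1 / max trill speed.
# A slow trill between two fingerings -> hard transition -> heavy edge.
import heapq

def shortest_path_cost(graph, start, goal):
    """Dijkstra's algorithm over a dict-of-dicts weighted graph."""
    dist = {start: 0.0}
    heap = [(0.0, start)]
    while heap:
        d, node = heapq.heappop(heap)
        if node == goal:
            return d
        if d > dist.get(node, float("inf")):
            continue  # stale heap entry
        for nbr, w in graph.get(node, {}).items():
            nd = d + w
            if nd < dist.get(nbr, float("inf")):
                dist[nbr] = nd
                heapq.heappush(heap, (nd, nbr))
    return float("inf")  # goal unreachable

# Invented example weights: 1 / max_trill_speed (alternations per second).
graph = {
    "C4": {"D4": 1 / 8.0, "E4": 1 / 5.0},
    "D4": {"E4": 1 / 7.0},
    "E4": {},
}
print(shortest_path_cost(graph, "C4", "E4"))  # direct jump is cheapest here
```

In this toy graph the direct C4→E4 transition (cost 0.2) beats going via D4 (cost 0.125 + ~0.143), so the optimal path is the direct one; the real model makes the same kind of trade-off across a whole phrase.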
I primarily focused on ensuring that the methodology and model architecture are easily transferable to other woodwind instruments. Models for similar instruments can, using my findings, now be created with significantly less effort and time.
What inspired you to focus on this topic?
I’ve been playing the clarinet, saxophone, and writing music since I was little, so this topic is close to my heart. It’s also nice that I can apply this model in my own work, where it serves as an automated ‘early warning system’ when writing saxophone parts.
There’s a newly founded research group at Matfyz called the Prague Music Computing Group, led by MgA. Jan Hajič jr., Ph.D.; at this group’s lectures on musical informatics and digital musicology I discovered just how neglected wind instruments are in the entire research field. I found that to be a great shame, so I wanted to write a thesis that could improve the lives of woodwind players.
Can you explain the specific benefits or uses of your work?
The biggest benefit is the model itself. I have managed to create a MuseScore 3.6 (music notation software) plugin implementing the model, allowing users, composers, researchers, and the like to use it in their real workflows and not just as part of some tech demo.
However, there are many side benefits. As part of creating the model, I created a novel corpus of saxophone trill recordings, which were used to train the model. A pipeline for automatically processing these saxophone recordings was created, which can be reused for other instruments. Analysis of the recorded trills brought to light many unexpected insights about the variance of trills between similarly experienced saxophonists. Additionally, I showed how expert knowledge from pedagogical literature can be used to improve model performance.
Importantly, I showed that the woodwind family can be studied in the same way as the piano and guitar; we can also accomplish quite a lot even in situations with limited access to data.
What technologies have you worked with, what methods have you used, and why these?
All implementation was done using Python due to the availability of machine learning libraries (sklearn). For frequency analysis of the audio recordings I used CREPE, a state-of-the-art “fundamental frequency prediction” library. The transfer from the audio domain to the symbolic music domain was achieved using librosa, which is one of the most complete libraries for audio and music analysis.
To fill in data for unrecorded trills (when training future models, the aim is not to have to record everything) I trained a relatively small and simple multilayer perceptron model. I was working with an extremely small dataset, so more complex models would not have helped. I also wanted to show that, with the help of pedagogical knowledge, we did not need to record more data.
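A small MLP of this kind might look like the sketch below. The features (fingers moved, interval size) and the synthetic target are invented purely for illustration and are not the thesis's actual inputs; the point is only the scale of the model, a single small hidden layer on a tiny dataset.

```python
# Illustrative sketch of a small MLP regressor for infilling a
# trill-speed-like target from simple fingering features.
# Features and target here are synthetic, invented for the example.
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
# Toy features: (number of fingers moved, interval size in semitones)
X = rng.integers(1, 6, size=(40, 2)).astype(float)
# Toy target: trill speed that drops as either feature grows, plus noise
y = 10.0 - 1.2 * X[:, 0] - 0.5 * X[:, 1] + rng.normal(0.0, 0.2, 40)

model = make_pipeline(
    StandardScaler(),                       # MLPs are scale-sensitive
    MLPRegressor(hidden_layer_sizes=(8,),   # one small hidden layer
                 max_iter=5000, random_state=0),
)
model.fit(X, y)
print(model.predict([[2.0, 3.0]]).shape)  # one prediction per query row
```

With so few training points, anything deeper than one small hidden layer mostly adds parameters to overfit with, which matches the reasoning above.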
What do you consider to be the most crucial result or conclusion of your work?
Apart from the final model, the insights into the data proved to be the most significant contribution to the literature. My conclusions on how little data is actually needed to create a successful model, and the findings about the nontrivial difficulty of normalising our data, can save a lot of time and effort for those who want to record their own datasets and create their own models.
What was the most challenging part of writing your work? Was there something you got stuck on, some path that didn't lead anywhere? Is there anything you would have done differently in hindsight?
The most time-consuming aspect was definitely data acquisition. I worked with multiple conservatory saxophonists, and figuring out the where and when was always a challenge (I’d like to thank the faculty building management for finding a suitable recording space).
If I had to start over again, I’d focus a lot more on creating a more modular and flexible codebase from the beginning. As I thought of new experiments I wanted to run, I had to keep refactoring old code to make it suitable for what I wanted to do.
How did you verify the results of your work?
There are no “ground truth” datasets about difficulty for the saxophone, so a comprehensive evaluation of the model was not possible. Designing the method of data collection and doing the actual data collection would be a bachelor’s thesis all of its own, seeing as one would have to wrestle with what ‘difficulty’ means in a more comprehensive manner. Despite that, my model’s output corresponds to my intuitive understanding of difficulty, at least when it comes to the relative difficulty between musical phrases.
Evaluation of the aforementioned neural network for infilling missing data was done using standard cross-validation on the recorded data (where you set aside some subset of the recorded data, train the neural network on the rest, and then evaluate the model on the data that was set aside). I managed to get pretty close to the theoretical baseline values, which were derived from the level of variance in our recorded data (even though there is still room for improvement).
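The hold-out-and-rotate procedure described above is exactly what k-fold cross-validation automates. The sketch below shows the mechanic on synthetic data; the dataset, model size, and scoring choice are assumptions for the example, not the thesis's actual evaluation setup.

```python
# Sketch of k-fold cross-validation: each fold is held out once,
# the model is trained on the rest, and scored on the held-out fold.
# Data here is synthetic, for illustration only.
import numpy as np
from sklearn.model_selection import KFold, cross_val_score
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(1)
X = rng.normal(size=(50, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + rng.normal(0.0, 0.1, 50)

model = MLPRegressor(hidden_layer_sizes=(16,), max_iter=5000, random_state=1)
cv = KFold(n_splits=5, shuffle=True, random_state=1)
scores = cross_val_score(model, X, y, cv=cv, scoring="r2")
print(len(scores))  # one R^2 score per held-out fold
```

Averaging the per-fold scores gives a single estimate of how well the infilling model generalises to trills it never saw during training.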
Do you feel your work can inspire other students or professionals in the field?
I believe that it can. There are many musicians at Matfyz, and I think it’s a pity that more students don’t try to connect these two parts of their life into some kind of interdisciplinary work. I conceived of this thesis project on my own because it was something I myself would use. For those looking for thesis supervisors: don’t be afraid to come to them with your own thesis ideas. I’m sure they will appreciate it, and you will find the work more fulfilling and interesting.
Doctor Hajič and I turned this thesis into a research article that got accepted at ISMIR 2025 (International Society for Music Information Retrieval), which is the most prestigious conference in the field of MIR (music information retrieval); I hope we are able to get more people in the field interested in working on woodwind instruments.
What are your plans for the future?
For at least the next few years, I am ending (or pausing) my studies at Matfyz and moving to the Netherlands to pursue a Bachelor’s degree in Jazz Composition. Though it may look like I’m changing fields, I hope to continue finding new music-related software projects to work on.
The field of MIR and digital musicology is growing, mainly due to greater access to more data, and I would like to grow alongside the field. I still plan on being active as part of the Prague Music Computing Group – few universities have such a developed research group devoted to the intersection of music and computer science.
Alena Chrastová