Charles Explorer: Intelligent Search Engine Across the University

December 20, 2024

Searching for specific information in SIS (the Study Information System of CUNI) can be challenging, especially when you need to look across different faculties. Jindřich Bär, a recent graduate of the Master’s programme in Software and Data Engineering at the Faculty of Mathematics and Physics, decided to change this. As part of a company project, and subsequently his thesis, he developed Charles Explorer, a tool that lets users search for courses, study programmes, publications, and staff across all of Charles University.

Could you briefly introduce your project?

Charles Explorer is a web application – a search engine for courses, study programmes, publications, and academic staff at Charles University. My focus is on user comfort and clarity, enabling users to easily connect information across various university systems. For example, the system allows quick searches for relationships between academics and their publications or the courses they teach.

What inspired you to focus on this topic?

The existing university systems (SIS, Verso, etc.) lacked data integration. Information about publications and their authors is stored in Verso, course details are found in SIS, and details about study programmes are scattered across individual faculty websites. This inspired Professor Skopal, the supervisor of my project, to propose a tool that would connect this data and make it easier to search. At the same time, I wanted to create a platform that would be more intuitive and accessible for students, academic staff, and the broader public.

Can you explain the specific benefits or applications of your work?

Compared to existing systems, Charles Explorer offers a more user-friendly experience, a responsive design, and the ability to view data in context. Additionally, relationships between people and publications can be displayed in a graph (network) view, allowing users to explore data from different perspectives.
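To illustrate the kind of relationship a graph view can expose, here is a minimal sketch that derives a co-authorship network from publication records. It is not how Charles Explorer is implemented (the interview only says a graph database is part of the stack); the function name `coauthor_graph`, the input shape, and the sample names are all hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def coauthor_graph(publications):
    """Map each author to the set of people they share a publication with.

    `publications` is an assumed shape: publication id -> list of author names.
    Authors who only have single-author publications do not appear in the result.
    """
    graph = defaultdict(set)
    for authors in publications.values():
        # Every pair of distinct authors on one publication becomes an edge.
        for a, b in combinations(sorted(set(authors)), 2):
            graph[a].add(b)
            graph[b].add(a)
    return dict(graph)


# Hypothetical sample data, for illustration only.
sample = {
    "pub-1": ["Alice", "Bob"],
    "pub-2": ["Bob", "Carol"],
}
graph = coauthor_graph(sample)
```

In this sample, Bob ends up connected to both Alice and Carol, which is the sort of link a network view makes visible at a glance.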

This helps make the university more accessible to the broader public or prospective students, for whom the SIS course pages might appear too technical. For every entity, we still provide links to existing systems to further enhance data integration. Where possible, we also publish data as Linked Data, which, for example, improves the presentation of individual entities in web search engines (e.g., Google Rich Results) and facilitates further data processing.
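As a rough illustration of publishing an entity as Linked Data, the sketch below builds a schema.org description of a publication and wraps it in the JSON-LD script tag that search engines read. The choice of fields, the helper names, and the example.org URL are assumptions; the interview does not specify which vocabulary or properties Charles Explorer actually emits.

```python
import json


def publication_jsonld(title, year, authors, url):
    """Describe a publication with schema.org terms (field choice is an assumption)."""
    return {
        "@context": "https://schema.org",
        "@type": "ScholarlyArticle",
        "name": title,
        "datePublished": str(year),
        "url": url,
        "author": [{"@type": "Person", "name": name} for name in authors],
    }


def as_script_tag(doc):
    """Serialize the description as an embeddable JSON-LD script element."""
    payload = json.dumps(doc, ensure_ascii=False, indent=2)
    # Escape "</" so the payload cannot close the script element early.
    payload = payload.replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{payload}\n</script>'


# Hypothetical record, for illustration only.
doc = publication_jsonld(
    "An Example Publication", 2024, ["Jane Doe"], "https://example.org/publication/1"
)
print(as_script_tag(doc))
```

Embedding a block like this in an entity's page is one common way to make the same data readable by both people and machines.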

What technologies did you work with, and what methods did you use?

I worked with Node.js, PostgreSQL, Apache Solr, Memgraph, and Docker. I had prior professional experience with Node.js, while the rest of the stack aligns with what I found to be the current industry standard. The widespread adoption of these technologies ensures extensive support, integrations, tools, and active community forums where both questions and answers are readily available. As a result, I didn’t spend too much time stuck on any single problem.

What was the most challenging part? Is there anything you would do differently in hindsight?

Before relying on the established technologies mentioned above, I spent about a month experimenting with my own implementations of databases and full-text search. This caused far more harm than good: I wasted a lot of time debugging “my” database, which significantly slowed down the development of the actual web application. Fortunately, I abandoned this approach in time. Sometimes there’s no need to reinvent the wheel.

How did you verify the results of your work?

For Charles Explorer as a web application, I monitor user traffic through Google Analytics and Google Search Console. During development, I also performed regular load testing and profiling of the web application.

Within my thesis, I focused more on the data itself than on the application. Here, I used standard methods for evaluating ML models, such as cross-validation and the F1 score. I also compared search results from Charles Explorer with those of commercial systems such as Elsevier Scopus and similar platforms.
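For readers unfamiliar with these evaluation methods, the following sketch shows how an F1 score and k-fold cross-validation can be computed for a binary classifier. It is a generic illustration, not the thesis code: the function names, the `fit` callback, and the toy data are hypothetical, and the thesis may well have used a library instead.

```python
def f1_score(y_true, y_pred):
    """Harmonic mean of precision and recall for binary labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t and p)
    fp = sum(1 for t, p in zip(y_true, y_pred) if not t and p)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t and not p)
    if tp == 0:
        return 0.0  # no true positives: F1 is defined as 0 here
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def k_fold_splits(n_samples, k):
    """Yield (train_indices, test_indices) for k contiguous folds."""
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size


def cross_validated_f1(features, labels, fit, k=5):
    """Average F1 over k folds; `fit` returns a callable that predicts one sample."""
    scores = []
    for train, test in k_fold_splits(len(labels), k):
        model = fit([features[i] for i in train], [labels[i] for i in train])
        predictions = [model(features[i]) for i in test]
        scores.append(f1_score([labels[i] for i in test], predictions))
    return sum(scores) / len(scores)
```

Each fold is held out once for testing while the model is fitted on the rest, so the averaged score reflects performance on data the model has not seen.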

It’s also interesting to observe the growth in the number of user comments – sometimes users notice discrepancies in the data and notify us via email, which helps us continuously improve the system.

What do you consider the most important result or conclusion of your work?

I believe the key achievement is that we successfully connected various university systems and provided users with an easier way to access relevant information. Charles Explorer simplifies access to data, which can be valuable for students, academic staff, and the broader public.

I think it adds significant value, particularly in terms of better data navigation and user-friendliness. Additionally, the Explorer establishes a platform for the development of future tools for exploring university data.

Do you think that your work can serve as inspiration for other students or professionals in the field?

I think the project can demonstrate how various university data can be effectively connected and made accessible to users. It doesn’t necessarily have to be an inspiration, but I hope it can show others how similar challenges can be tackled and perhaps serve as a solid foundation or starting point for future projects. Every project is unique, but if my experiences help someone, I’ll be happy.

What are your future plans?

I truly value the experience I gained from this project; it not only helped me grow personally but also enriched my portfolio. I plan to maintain the project for as long as needed, or until a suitable successor appears. While working with data was an exciting challenge, I now plan to take a short break from this field. I’d like to focus on some of my smaller hobby projects, though I’m not ruling out a return to data systems if an interesting project or opportunity comes my way.

Charles University, Faculty of Mathematics and Physics