The traditional approach to website design is to build a single site that caters to all audiences. The site is static and cannot adapt to the user's context. This is akin to a trading strategy that ignores the prevailing market regime: the outcomes will not be optimal.
The user experience can be improved by identifying audiences and creating optimised pathways through the site. Users can self-select into a context through the navigation. This approach mitigates the static nature of the site without the need for complex calculations based on a user profile.
A more sophisticated approach involves implementing content personalisation and content recommendations, delivering adaptive content suited to the user. These approaches take advantage of the user context and can achieve improved user experiences. We have written extensively on these topics on the Morpht blog over the years. In this article we extend the ideas to the delivery of search.
Financial regimes
In the world of finance, the concept of “regimes” is used to improve the predictions made by an initial model. Knowing the context (volatility, market direction, etc.) helps the model optimise its signals for the prevailing conditions. A simplistic approach would be to devise regimes for bull/bear and calm/volatile markets. More expansive schemes may include any number of regime-based features, which may help the model make better predictions. These regime features are then included alongside the asset features to provide a more complete picture for predictions.
In finance, regimes can be implemented via:
- Simple filters to exclude certain assets from consideration
- Inclusion of individual regime characteristics as features for a primary model
- Use of regimes and outcomes via meta-labelling for a secondary model
- Conditional portfolio optimisation involving regime features
- Many more, no doubt.
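As a sketch of the first two approaches above, consider the following Python snippet. The price series is synthetic, and the window lengths, threshold and feature names are illustrative assumptions rather than a real strategy.

```python
import numpy as np
import pandas as pd

# Synthetic daily closes standing in for a real price feed.
rng = np.random.default_rng(42)
prices = pd.Series(100 * np.exp(np.cumsum(rng.normal(0, 0.01, 500))), name="close")

returns = prices.pct_change()

# Regime features: rolling volatility and trend direction.
volatility = returns.rolling(21).std()
bull = (prices > prices.rolling(50).mean()).astype(int)  # 1 = bull, 0 = bear

# Approach 1: a simple filter that only admits signals in calm markets.
calm = volatility < volatility.quantile(0.5)

# Approach 2: regime features joined to asset features for a primary model.
features = pd.DataFrame({
    "return_1d": returns,          # asset feature
    "volatility_21d": volatility,  # regime feature
    "bull_regime": bull,           # regime feature
}).dropna()

# A primary model would train on `features`; the filter in approach 1
# would simply zero out or skip trades on days where `calm` is False.
```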
In short, regimes are a big deal in trading strategies as they help improve outcomes based on the prevailing context. The environment can change, and so can the efficacy of a strategy. What is profitable one day may not be the next. Regimes can help with this.
What do "regimes" mean for search?
What context can be used to improve the results?
In a previous article, I reviewed the various components that comprise relevancy for search. Search is not just about the keywords; it is about the context the user is operating in. This can include concepts such as search history, browsing behaviour, time and the meaning of the content. Much of this context can be considered a regime that conditions the results being returned.
Each user is different and will have a different set of interests. These interests form the “regime” in which the search is conducted. What makes a successful set of results depends on who the user is and what they are interested in. Results conditioned on these interests will contain fewer false positives, leaving the user happier.
How can we determine user interests?
Explicit regimes: Facets and Filters
The easiest way to find out what someone is interested in is to ask them. You may not always get a straight answer, but asking is quick and the answer is usually specific to the current context.
Traditional search, based on keywords alone, has no access to the user context. In such an environment, context could be determined through the user selecting filters and refining the results through facets.
The use of faceted interfaces offered a lot of promise. Well-designed facets can slice and dice a large corpus across different dimensions, allowing the user to filter their way to success. Common facets include “audience”, where a user self-selects into a persona, and “topic”, where a user declares a particular interest. These two dimensions are a reasonable way to divide content.
The upside to this approach is that the user remains anonymous and in control, and a well-chosen facet can markedly improve the precision of the results.
There are downsides, though:
- Effort is required to configure the interface.
- Content is divided into limited dimensions, missing nuance.
- Content is still primarily retrieved via a keyword search.
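To make the mechanics concrete, here is a minimal sketch of a faceted query against a Solr backend (the kind of backend discussed later in this article). The core URL and field names are hypothetical; any faceted search backend works along the same lines.

```python
import requests

# Hypothetical Solr core and field names.
SOLR_SELECT = "http://localhost:8983/solr/content/select"

params = {
    "q": "grants",                      # the user's keyword query
    "fq": [                             # filters the user has clicked
        'audience:"researchers"',       # self-selected persona
        'topic:"funding"',              # declared interest
    ],
    "facet": "true",
    "facet.field": ["audience", "topic"],  # dimensions offered back to the user
    "wt": "json",
}

results = requests.get(SOLR_SELECT, params=params).json()
for doc in results["response"]["docs"]:
    print(doc.get("title"))
```

Note that retrieval here is still keyword-driven; the facets only narrow the candidate set.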
An interesting modern take on users controlling the results they receive is the recent announcement that Google will allow users to specify their preferred news sources. This explicit opt-in to result sets is an interesting development in the world of search: users are gaining more control through explicit configuration of what they want to see.
Implicit regimes: User behaviour
The behaviour of the current user also provides a “regime”, or context. For this regime to be calculated, user behaviour needs to be tracked: for each item a user interacts with, an event is logged against that user. Tracking of this kind takes place in analytics software, personalisation engines and customer data platforms (CDPs). Each of these systems can build a picture of the user from their behavioural history.
Personalisation engines are perhaps the best example of how this implicit regime is utilised. In practical terms, a personalisation engine uses the user's behaviour to develop a conceptual “picture” (a vector) of the user. This picture can then be compared to the returned items to weed out the irrelevant ones, i.e. the user profile acts as a regime that improves the accuracy of results and the user experience.
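A minimal sketch of that idea follows, assuming item embeddings already exist (random stand-ins here) and using a simple mean of interacted items as the user's “picture”.

```python
import numpy as np
from collections import defaultdict

rng = np.random.default_rng(0)
DIM = 8  # toy dimensionality; real embeddings are typically 384-1536

# Hypothetical content embeddings, one vector per item.
item_vectors = {f"article-{i}": rng.normal(size=DIM) for i in range(20)}

# Step 1: track behaviour by logging each item the user interacts with.
events = defaultdict(list)
events["user-123"].extend(["article-3", "article-7", "article-3"])

def profile_vector(user_id):
    """The user's 'picture': the mean of the items they interacted with."""
    vecs = np.array([item_vectors[item] for item in events[user_id]])
    return vecs.mean(axis=0)

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Step 2: use the profile as a regime to re-rank (or weed out) candidates.
candidates = ["article-1", "article-7", "article-15"]
user = profile_vector("user-123")
ranked = sorted(candidates, key=lambda c: cosine(user, item_vectors[c]), reverse=True)
print(ranked)
```

Real personalisation engines use far richer models than a plain average, but the shape of the computation is the same: behaviour in, profile vector out, similarity used to condition results.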
Search engines have utilised user behaviour by tracking page views and link clicks. Such an approach was pioneered by Google. In the world of SaaS search, we have seen more sophisticated search engines consider user behaviour. For example, Sajari (acquired by Algolia) used user behaviour in this way to alter results. Algolia now provides such capabilities. Personalisation engines, such as Recombee, are also able to utilise behaviour to customise results according to various scenarios and queries.
The use of user behaviour in this way is not readily available to CMSs such as Drupal. Web servers usually sit behind a CDN, and there is often no built-in concept of a user ID that persists across sessions, so users remain mostly anonymous and unknown to the application. This moat protects users, but it also makes precise results harder to deliver, and it is the main barrier to more personalised experiences. The solution has generally been to go to a SaaS provider who can manage the hard parts on an external platform. There are opportunities here for CMS providers, but it isn't just a data collection issue; it also comes down to algorithms and data processing, and a SaaS solution remains attractive for those reasons.
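For illustration only, a site can mint its own persistent anonymous ID with a first-party cookie. This sketch uses Flask purely as a stand-in for whatever application server is in play; the cookie name and lifetime are assumptions.

```python
import uuid
from flask import Flask, request, make_response

app = Flask(__name__)

@app.route("/")
def index():
    # Reuse the visitor's anonymous ID if the cookie exists, else mint one.
    uid = request.cookies.get("anon_id") or str(uuid.uuid4())
    resp = make_response("hello")
    # A long-lived first-party cookie provides a user ID that persists
    # across sessions without identifying the person behind it.
    resp.set_cookie("anon_id", uid, max_age=60 * 60 * 24 * 365, samesite="Lax")
    return resp
```

Minting the ID is the easy part; collecting and processing the events is where the real work lies.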
Semantic search: Vector databases and embeddings
Users now demand far more sophisticated and natural search interfaces. User behaviour has changed in a number of ways in recent years:
- Users no longer craft special keywords; they search with concepts rather than exact words.
- Queries are now more likely to be questions than “titles”.
- Chat and conversational interfaces are proliferating.
- Search results pages provide answers, not just items.
As a result, vector databases are becoming more popular, opening up new ways of interacting and new business models, because content can now be understood on a semantic level. Content is converted to embeddings, and search queries are then matched against those embeddings.
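Here is a minimal sketch of semantic search, using the sentence-transformers library as one possible encoder. The model choice and the toy corpus are illustrative assumptions.

```python
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")  # small general-purpose encoder

corpus = [
    "How to apply for a research grant",
    "Accessibility requirements for government websites",
    "Budget reporting deadlines for agencies",
]

# Content is converted to embeddings once, at index time...
doc_vecs = model.encode(corpus, normalize_embeddings=True)

# ...and the query is embedded and matched at search time.
query_vec = model.encode(["inclusive design for public sector sites"],
                         normalize_embeddings=True)[0]

scores = doc_vecs @ query_vec  # cosine similarity, since vectors are normalised
print(corpus[int(np.argmax(scores))])
```

The query shares almost no vocabulary with the documents; any match has to happen at the level of meaning rather than exact keywords.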
Conceptually speaking, the user profile can itself be converted to a vector, and this vector can be used to match the user to content. Sophisticated search engines could use the profile to further condition results.
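Continuing in the same vein, here is a sketch of how a profile vector might condition a ranking by blending it with query relevance. The vectors are random stand-ins and the blend weight is a tuning assumption.

```python
import numpy as np

rng = np.random.default_rng(1)
DIM = 384  # matches a typical sentence encoder; the vectors here are stand-ins

doc_vecs = rng.normal(size=(3, DIM))   # content embeddings
query_vec = rng.normal(size=DIM)       # the embedded query
user_vec = rng.normal(size=DIM)        # profile vector built from behaviour

def normalise(v):
    return v / np.linalg.norm(v, axis=-1, keepdims=True)

doc_vecs, query_vec, user_vec = normalise(doc_vecs), normalise(query_vec), normalise(user_vec)

# Blend query relevance with user-profile relevance: the profile acts as
# a regime that conditions the ranking rather than replacing the query.
alpha = 0.7  # weight on the query; 1 - alpha on the profile (a tuning choice)
scores = alpha * (doc_vecs @ query_vec) + (1 - alpha) * (doc_vecs @ user_vec)
order = np.argsort(-scores)  # best-matching documents first
```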
Prompt engineering
There is possibly a more direct way to customise results: through the prompt. Rather than passing vectors around in the backend, the prompt itself can carry the user's request along with conditioning that customises it.
Recently, GPT-5's system prompt was leaked. From it, we can see that system prompts make use of guard rails and role-based conditioning for the responses. This is perhaps not a groundbreaking revelation, but it does show how the prompt can be used to condition the results coming back, in effect introducing the regime into the query. We can use this to craft better prompts, for example by including the user's interests or context.
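As a sketch of what this can look like in code, using the OpenAI Python client as one example backend: the model name, persona wording and instruction are all illustrative assumptions.

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def conditioned_answer(question, audience, interests):
    # The regime (who the user is, what they care about) goes into the
    # system prompt; the user's question itself stays untouched.
    system = (
        "You answer questions about government websites. "
        f"The user is a {audience} who cares about {', '.join(interests)}. "
        "Prioritise material relevant to that audience."
    )
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # illustrative model choice
        messages=[
            {"role": "system", "content": system},
            {"role": "user", "content": question},
        ],
    )
    return response.choices[0].message.content

print(conditioned_answer(
    "What challenges face governments in delivering websites?",
    audience="security professional",
    interests=["authentication", "privacy"],
))
```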
Let's run an experiment on GovFlix, a toy demonstration website Morpht has created, by passing in the user's “audience” with the request. Try the following requests in the chatbot:
- I am a UX and design professional. What challenges face governments in delivering websites?
- I am a security professional. What challenges face governments in delivering websites?
- I am an editor. What challenges face governments in delivering websites?
You will see that the results returned are similar but vary according to the audience provided. The differences are modest (due to the small number of hits), but the content that comes back does change.
Such an observation opens the way for easy personalisation of search interactions. If the client system can assign a persona to the user, and possibly a set of interests, the results can be conditioned accordingly. This allows a better experience without tracking the user's behaviour in an external system: the context is passed in through the prompt, with no need to alter the core semantic search technology.
As a more concrete example, a tool such as Convivial Profiler could be used to derive the user's current audience, which could then be passed in as part of the prompt, conditioning the search query.
More broadly, it is clear that chatbot solutions will increasingly track previous conversations to build user profiles and better derive intent. The user context may well be best derived by inspecting the user's conversational history; what better way to work out intent? This may be more direct and effective than observing page view behaviour, and it points to conversational interfaces becoming the default way users will wish to interact with a site.
Final thoughts
This article started from the premise that reducing false positives helps deliver better experiences. The use of a "regime" in primary and secondary (meta-labelling) models can produce better results conditioned on that regime. Traditional ways of conditioning responses through faceting and filtering have been effective but require explicit user action. They are now giving way to user profiles, embeddings and prompt conditioning as ways to better personalise user experiences.
Content management systems such as Drupal have made good use of backends such as MySQL and Solr through the Search API integration module. This has provided Drupal with a solid search solution over the past decade or more. Drupal continues to innovate with initiatives such as the Drupal AI Initiative, where semantic search backends are now being added to the mix. This promises to improve search results significantly by moving beyond the keyword model. However, some things may be lost along the way, such as faceting and recency boosting. The technology needs to be selected to fit the requirements.
The next logical step for Drupal would be to incorporate user behaviour into results. This is, however, a complex field, because the challenges are not just about content. As outlined above, the definition of relevance brings many other considerations into the picture. Each result set needs to take into account the context in which the user expects the results, and different scenarios for the content being returned will need to be considered. A deeper understanding of relevancy will require nuanced approaches such as these.
This complexity requires a deeper understanding of the search problem, and it is one where SaaS solutions may retain the upper hand. Can generic tools deliver the features needed? The addition of semantic search features is one step towards the future; including user behaviour is the next. This will involve better algorithms and a more nuanced approach to delivering the best results for the context.
The most likely outcome is that chatbots and conversational interfaces will continue their rise to dominance. Semantic search systems can synthesise large amounts of content and present it in ways that are more convenient for users. The ability to leverage chat history as context is a powerful advantage that should see conversational interfaces supplant the previous dominance of search. The user context, as a regime, will improve the experience to such an extent that traditional search will appear inferior for most use cases.
The delivery of improved search interfaces is a huge opportunity for website owners, allowing them to leverage their precious content. Websites will need to deliver improved user experiences if they are to remain competitive with the external systems that have also indexed their content.