The first speaker in the first paper session at the SEASON 2025 conference is Frans van der Sluis, whose focus is on information quality in information retrieval. Judging information quality is hard: often, there is an inherent uncertainty around truth, and a question about whether a statement can be reliably justified as true. Practices such as cherry-picking, bothsidesism, and framing exploit such uncertainties. Overall, then, information quality might mean any of accuracy, comprehensiveness, expertise, usefulness, bias, and more.
Search engines tend to objectify such quality, mostly by assessing relevance; Google also seeks to include expertise, authoritativeness, and trustworthiness factors, though. In addition, some documents may come with fact-checking labels, but this coerces a lot of different considerations into a single label in problematic ways. This produces an opaque information ecosystem where users have few clear markers to navigate by. As a result, users tend to perform poorly at recognising quality; this goes for ordinary users as well as specialists in various fields, in part also due to a lack of time, ability, or motivation.
Instead, users navigate based on cues, including informational (ranking, source), social (votes, endorsements), and experiential (fluency, confidence) cues – this can lead to overconfidence and misunderstandings.
There is, then, a missing middle between objective and subjective assessments – that missing middle might be described as intersubjective validity, which is neither objective truth nor a purely subjective assessment, but a sense amongst users that there is a middle ground, and that we should be able to agree on quality.
This project explored this via a study of online fora: if quality claims are contested, then this means that participants believe that people should be able to agree on quality; they then engage in certain practices which share a distinct goal, are cooperative, and are socially constructed. The team identified such practices through manual coding of forum posts, prominently identifying quality arguments drawing on correctness, usefulness, comprehensiveness, and expertise.
These were further explored in an experimental study by confronting participants with a set of vignettes where they were required to judge intersubjective validity. Can such results be applied to search environments, then? In such contexts, intersubjective validity is a design question: how can this be promoted, and what signals should be provided to users in such contexts? Community Notes on Xitter are an example of how such designs are currently deployed in social media contexts, but how does this translate to search?
Ultimately, reliability is intersubjectivity at scale: it emerges from the observation of relations between claims – when claims align, they support higher reliability; when they don’t, they signal higher uncertainty. But search engines tend to resist revealing the context and reasons for their ranking approaches, and reliability cues can also slow down and confuse users, so there’s more work to be done here.