Telling stories with Old Bailey Data: An SHL development

This content has been created to explore approaches to the presentation of information derived from the proceedings of Old Bailey Trials (Sessions Papers). The datasets that this work relies on are output from 'The Old Bailey Online' and related projects.


This demonstration site is designed to illustrate how a range of tools for textual and data analysis might be combined to create a working ‘macroscope’: an environment where ‘big data’ can be explored both at scale and at the level of the single datum. Its purpose is to allow a new ‘open-eyed’ way of working with data of all sorts, in which macro-patterns and clusters can be identified while single words and phrases can be fully contextualised.

The site builds on the Old Bailey Online dataset, which encompasses accounts of some 197,745 trials held at the Old Bailey in London between 1674 and 1913. The Proceedings contain some 127,000,000 words of accurately transcribed text, which has in turn been substantially marked up in XML to encode the administrative process (crime, verdict, punishment, etc.) reflected in the accounts. Additionally, text that purports to reflect direct speech has been marked up to allow for analysis of verbal-linguistic change.
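As a rough illustration of how marked-up trial text of this kind can be read programmatically, the sketch below pulls the administrative fields and the direct speech out of a single trial. The XML fragment, and the `trial`, `field` and `speech` element names, are hypothetical simplifications invented for this example; they are not the actual Old Bailey Online schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified fragment. The element and attribute names are
# illustrative assumptions, not the real Old Bailey Online markup.
SAMPLE = """
<trial id="example-1">
  <field type="crime" value="theft"/>
  <field type="verdict" value="guilty"/>
  <field type="punishment" value="transportation"/>
  <p>The prisoner was indicted for stealing a watch.
    <speech speaker="witness">I saw him take it.</speech>
    <speech speaker="defendant">I never touched it.</speech>
  </p>
</trial>
"""


def summarise_trial(xml_text):
    """Return the structured fields and the speech passages of one trial."""
    root = ET.fromstring(xml_text.strip())
    # Administrative process: one entry per encoded field (crime, verdict, ...).
    fields = {f.get("type"): f.get("value") for f in root.iter("field")}
    # Direct speech: (speaker, text) pairs in document order.
    speech = [(s.get("speaker"), (s.text or "").strip())
              for s in root.iter("speech")]
    return {"id": root.get("id"), "fields": fields, "speech": speech}


if __name__ == "__main__":
    print(summarise_trial(SAMPLE))
```

Keeping the structured fields and the speech passages in separate collections mirrors the two kinds of text described above (administration versus speech), so each can be filtered or counted independently.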

The Old Bailey dataset has been chosen because it combines a consistent textual record at scale; because it incorporates structured data; because it includes a variety of text types (speech vs administration); and because it represents a uniquely consistent dataset reflecting historical change over 240 years. But the purpose of this demonstrator is to suggest a more generic approach to ‘big data’ as a whole: an approach in which large-scale patterns can be rapidly explored and their constituent elements identified, or in which single words or phrases (or a single datum) can be radically contextualised within the full body of evidence available.




Speech Lines

Speech lines are timeline visualisations that describe the speech acts in a trial. Speech lines will be generated automatically for this panel once the data size is less than nnn (not implemented yet).
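One possible way to represent a speech line as data is sketched below: each speech act becomes a segment on a shared axis, positioned by cumulative word count. This is only an assumption about how such a timeline could be laid out, not a description of how this site draws it; the `Segment` class and `speech_line` function are hypothetical names introduced for the example.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """One speech act placed on the timeline (hypothetical structure)."""
    speaker: str
    start: int  # offset in words from the start of the trial's speech
    end: int    # offset in words where this speech act finishes


def speech_line(utterances):
    """Turn (speaker, text) pairs into consecutive timeline segments.

    Assumption: segment length is the number of whitespace-separated words.
    """
    segments = []
    cursor = 0
    for speaker, text in utterances:
        length = len(text.split())
        segments.append(Segment(speaker, cursor, cursor + length))
        cursor += length
    return segments


if __name__ == "__main__":
    trial = [("witness", "I saw him take it."),
             ("defendant", "I never touched it.")]
    for segment in speech_line(trial):
        print(segment)
```

Using word offsets rather than pixel positions keeps the structure independent of any particular drawing library; a renderer can scale `start` and `end` to whatever width the panel provides.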

The 'many timelines' and 'timeline by trial' buttons are legacy development results and remain as functional placeholders. Scroll to the bottom of the page to see the results. 'many timelines' is the result of not deleting an object before drawing a new one. In its current state it creates roughly a couple of thousand timelines, which is an interesting (if inefficient) way of looking at the data. This will be adapted for use in creating large sets of timelines.
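The difference between the two behaviours can be shown with a toy model, given here as a sketch only: the `Panel` class and its `clear_first` flag are hypothetical and stand in for whatever drawing surface the site actually uses. Clearing before each draw leaves one timeline on the panel; skipping the clear lets every draw accumulate, which is the 'many timelines' effect.

```python
class Panel:
    """Toy drawing surface that simply records what has been drawn on it."""

    def __init__(self):
        self.timelines = []

    def draw(self, timeline, clear_first=True):
        # Deleting the previous object first gives one timeline at a time.
        # Skipping the delete stacks each new timeline on top of the old ones.
        if clear_first:
            self.timelines.clear()
        self.timelines.append(timeline)


if __name__ == "__main__":
    single = Panel()
    many = Panel()
    for trial_id in ["trial-a", "trial-b", "trial-c"]:
        single.draw(trial_id)                   # replaces the previous drawing
        many.draw(trial_id, clear_first=False)  # accumulates drawings
    print(len(single.timelines), len(many.timelines))
```

Making accumulation an explicit option, rather than a side effect of a missing delete, is one way the accidental behaviour could be kept as a deliberate feature for large sets of timelines.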

Select a Trial

This control will become a pair of dials for indexing through the sessions and trials (e.g. a radio dial or clock hands).


Please get in touch if you'd like to know more.


Sussex Humanities Lab

Arts Road
East Sussex

Development Pages