Gents Mag, indexes and searching

The Gentleman's Magazine


Thoughts about Indexes and Searching

These thoughts about indexes and searching were prompted by using the Gentleman's Magazine, published in London 1731 to 1907, or 1922.
source type: Gents Mag
Using indexes and using search tools are quite different ways of exploring a text source like the Gentleman's Magazine. Searching has recently become popular with the use of websites. What is offered by some websites, but not explained, is a search tool that apparently works on .pdf page images using Optical Character Recognition (OCR) software somewhere in the process. Other tools are based on more reliable machine readable text, but there are still difficulties. A major failing of searching is that the text being searched is old: it has odd letter forms, odd spellings, odd hyphenations, and so on. Looking for Windermere you will not find Wynandermere, Winder Meer, and other spellings, or even the current spelling if it is split over a line or a page. It is not reasonable to expect the searcher to know all the possible variants which have to be searched for, some of which might be one-off errors. It gets worse if the target of the search has changed its name at one time or another: Thirlmere was known as Leathes Water and Wythburn Water, amongst other things, as well as having variant spellings. If the target of the search is not narrow but a wide interest then things get completely impossible: would you know to look for William Gibson when looking for people of interest to The Lakes? The user can't be expected to know what to look for. Powerful and apparently successful search engines like Google leave you in awe at their power, BUT you don't know what they have missed.
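The failure described above can be seen in a small sketch. The sample text is invented for illustration; only the place-name spellings come from the paragraph above. The function names are hypothetical, and the rejoining step assumes, rather crudely, that every hyphen at a line end marks a split word.

```python
# Invented sample text, laid out as it might appear on an old printed page.
page_text = (
    "We came to Wynandermere in the evening, and next day\n"
    "crossed Winder Meer by the ferry. The lake of Winder-\n"
    "mere is the largest in these parts."
)


def naive_search(text, term):
    """Return True if the exact term occurs in the text, ignoring case."""
    return term.lower() in text.lower()


def rejoin_hyphenated_lines(text):
    """Join words split by a hyphen at a line end.

    Assumption: every hyphen followed by a newline is a line-break hyphen.
    """
    return text.replace("-\n", "")


# The modern spelling is on the page, but split over a line, so it is missed.
found_plain = naive_search(page_text, "Windermere")

# Rejoining split words recovers that one occurrence; the two older
# spellings are still only found by someone who already knows them.
found_rejoined = naive_search(rejoin_hyphenated_lines(page_text), "Windermere")
```

Even with the split word repaired, the search for Windermere reports one hit and gives no hint that two other references sit on the same page.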
Using an index is a different process. The user does the searching by scanning down a list of what is available, presented in a usable order. This, too, has failings. Did the indexer include keys to everything in which you have an interest, and did he key oddly spelt variants under both the source spelling and a standardised modern form? Good indexing depends on common sense, appreciation of the source, and of people's interests, supported by strong terminology rules, awareness of the value of variant forms, and so on. It's an art: it's never perfect, but it should link you to the target of your search through all sorts of spellings and historical versions of names of people, places, etc. In the arrangement of the index keys a name like William Gibson will be presented in a list that makes his relevance to The Lakes apparent. Beware that although book indexing is often done very well, the indexing of books in a collection, for example in a public library, is usually extremely poor.
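Keying a variant under both the source spelling and a standardised modern form might be sketched as follows. This is only an illustration of the idea, not the method used for the project: the variant table, the function name and the page references are all hypothetical, and only the place names come from the text above.

```python
from collections import defaultdict

# Hypothetical variant table: spelling as printed -> standardised modern form.
VARIANTS = {
    "Wynandermere": "Windermere",
    "Winder Meer": "Windermere",
    "Leathes Water": "Thirlmere",
    "Wythburn Water": "Thirlmere",
}


def build_index(entries):
    """Build an index from (name_as_printed, page_ref) pairs.

    Each reference is filed under the spelling found in the source and,
    where the variant table knows one, under the modern form as well.
    """
    index = defaultdict(set)
    for name, ref in entries:
        index[name].add(ref)
        modern = VARIANTS.get(name)
        if modern is not None:
            index[modern].add(ref)
    # Present the keys in alphabetical order, as a printed index would.
    return {key: sorted(index[key]) for key in sorted(index)}


# Invented page references, for demonstration only.
entries = [
    ("Wynandermere", "page A"),
    ("Winder Meer", "page B"),
    ("Windermere", "page C"),
    ("Leathes Water", "page D"),
]

index = build_index(entries)
```

A reader scanning this index under Windermere is led to all three pages, and a reader who meets Leathes Water in the source can still find it under that name.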

What To Index and How

Indexing the transcribed pages, stuff relevant to Cumbria, in a reasonably thorough way is not a small task. Using a purist approach is laborious: one example should demonstrate the work needed. The Content group for the Gentleman's Magazine 1745 p.604, record G7450604.txt, could be (I have added an author for the purpose of demonstration):-
CONTENT
PERSON author: Smith, George & GS
PERSON soldier: Wade, George, Marshall
PERSON : Charles, Prince
PERSON soldier: Perth, Duke of
PERSON soldier: Ogilvy, Lord
PERSON soldier: Gordon, Lord
PERSON soldier: Pattenson, Thomas
PERSON unit: Murray's Regiment
PLACE Great Corby & Wetheral & Cumbria (Cumberland) & England
PLACE Carlisle & Cumbria (Cumberland) & England
PLACE Penrith & Cumbria (Cumberland) & England
PLACE Warwick Bridge & Wetheral & Cumbria (Cumberland) & England
PLACE Stanwix Bank & Carlisle & Cumbria (Cumberland) & England
PLACE Brampton & Cumbria (Cumberland) & England
PLACE Rickerby & Stanwix Rural & Cumbria (Cumberland) & England
PLACE Warwick & Wetheral & Cumbria (Cumberland) & England
PLACE Blackhall & St Cuthbert Without & Cumbria (Cumberland) & England
PLACE Rockcliffe & Cumbria (Cumberland) & England
DATE 1745
PERIOD 18th century, early & 1740s
EVENT rebellion: 1745 Rebellion
EVENT siege: siege, Carlisle
OBJECT_NAME magazine & Gentleman's Magazine
OBJECT_NAME Rowcliff (Rockcliffe) & Rickarby (Rickerby)
The time taken to record all this is not small.
Thinking about what is wanted from the indexing makes a less purist approach attractive.
Not all the keywords in the above analysis are wanted, 'Cumbria' and 'England' for example.
The planned indexes do not require many of the keywords to be held as separately identifiable concepts; they will just be entries in a general index.
But I do want to be able to index, in a controlled manner, the magazine date and author.
So a simplified approach is:-
CONTENT
PERSON author: Smith, George & GS
DATE 1745
PERIOD 18th century, early & 1740s
OBJECT_NAME magazine & Gentleman's Magazine
TEXT_SECTION
KEYWORD Wade, George, Marshall & Charles, Bonnie Prince & Perth, Duke of & Ogilvy, Lord & Gordon, Lord & Pattenson, Thomas & Murray's Regiment & Great Corby, Wetheral & Carlisle & Penrith & Warwick Bridge, Wetheral & Stanwix Bank, Carlisle & Brampton & Rickarby (Rickerby) & Rickerby, Stanwix Rural & Warwick, Wetheral & Blackhall, St Cuthbert Without & Rowcliff (Rockcliffe) & Rockcliffe & rebellion, 1745 & 1745 Rebellion & Carlisle, siege & siege, Carlisle
This pattern is similar to the indexing approach already used, successfully, for guide book transcriptions.
Note that I am indexing for Cumbria interest; and I am using terms to match the Old Cumbria Gazetteer.
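How a record in the simplified form might be read and turned into general index entries can be sketched as below. This is a minimal sketch under stated assumptions, not the software used for the project: it assumes each line is a tag followed by terms separated by ' & ', that a tag on its own (CONTENT, TEXT_SECTION) is a group heading, and that the word before ': ' on a PERSON line is a role. The function names and the 'PERSON/author' key are hypothetical, and the KEYWORD line is shortened from the example above.

```python
RECORD_ID = "G7450604.txt"

# A shortened copy of the simplified record shown above.
RECORD = """\
CONTENT
PERSON author: Smith, George & GS
DATE 1745
PERIOD 18th century, early & 1740s
OBJECT_NAME magazine & Gentleman's Magazine
TEXT_SECTION
KEYWORD Wade, George, Marshall & Carlisle & Rowcliff (Rockcliffe) & Rockcliffe & siege, Carlisle
"""


def parse_record(text):
    """Return a dict mapping each tag to its list of terms."""
    fields = {}
    for line in text.splitlines():
        tag, _, rest = line.partition(" ")
        if not rest:
            continue  # blank line or group heading: nothing to index
        if tag == "PERSON":
            # Assumed convention: 'role: names' follows the PERSON tag.
            role, sep, names = rest.partition(": ")
            if sep:
                tag, rest = f"PERSON/{role}", names
        terms = [term.strip() for term in rest.split(" & ")]
        fields.setdefault(tag, []).extend(terms)
    return fields


def general_index_entries(fields, record_id):
    """Pair every KEYWORD term with the record it came from, in index order."""
    return sorted((term, record_id) for term in fields.get("KEYWORD", []))


fields = parse_record(RECORD)
entries = general_index_entries(fields, RECORD_ID)
```

The controlled items (author, date, period) stay in their own fields, while everything on the KEYWORD line simply becomes an entry in the general index pointing back to the record.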

Article or Page?

Indexing in other elements of the Lakes project transcriptions has always been done a record, i.e. a page, at a time. The idea of indexing by article rather than by page for the Gents Mag was considered and rejected.
