How to Search

Whether you’re new to the PGP or an experienced researcher of our corpus, understanding our search tools will improve your ability to find specific entries, easily browse the PGP corpus, and familiarize yourself with geniza materials. Though you can start by typing in a keyword of interest, sometimes that will bring up too many results to sift through. Fortunately, there are a few ways you can refine the results to get to the information that will most directly help your research.

There are two main ways to search the PGP:

  • General search: use keywords or phrases in any language to return matching or similar results across all fields. Arabic script searches will return both Arabic and Judaeo-Arabic transcription content.
    • Searching all fields means you will access content that belongs to document descriptions, transcriptions, translations, and the metadata about the document’s date, language, related people, etc.
    • Use this search if you don’t know where to start!
  • RegEx search: use Hebrew or Arabic script to find precise matches in the transcriptions.
    • Regular Expression (RegEx) search assists research into the original language and content of the documents.

Search and Filter Results

Across all search types, with or without a search term, you can filter your search results by date type, document type, and whether images, transcriptions, translations, or scholarship records are available.

The filters are also additive: if you select Has Transcription, you can also select Has Translation and the results will update to show how many documents have transcriptions AND translations.

If the filter button is colored (green in light mode, berry in dark mode), a filter is still active and is affecting the number of search results you see.

Search and Sort

The PGP has several ways to sort the results you get:

  • Relevance – This is the default setting and should give you the most relevant documents for your search term.
  • Scholarship Records (Most/Least) – This will show your results in order of how many scholars have discussed, edited, and translated the document. Note that the PGP has not systematically added scholarship records for any category other than editions that appear in our database.
  • Document date – This will sort documents with dates in chronological order.
  • Input date – This will sort documents by the date they were entered into the PGP database.
  • Shelfmark – This will arrange the results in alphabetical order according to their library’s designated shelfmark.
  • Random – This generates a page of random documents to help you find new and interesting material!

General Search

General Search at a Glance

  • You can search using a keyword or a phrase in English or the original language of the document.
  • You can search by field, such as shelfmark or PGPID, by prepending your search with the field name and a colon (e.g. shelfmark:).

Search for an exact phrase

If you want to look for a particular phrase in the description, wrap your search terms in quotation marks. (If you’re looking for exact phrases in the transcription, we recommend you search in RegEx mode.)

  • For example, if you wanted to search for records that reference Avraham Maimonides, son of Moses Maimonides, you could search “avraham maimonides” to distinguish him from his father, or “nagid avraham” to distinguish him from other people named Avraham in the database.

You don’t have to use quotation marks for single words, only for phrases.

Combine Terms

Boolean operators are connectors that let you control how your search terms are combined.

The default behavior is OR — to match any term.

  • If you search for avraham maimonides, the search will return all results that contain the words “avraham” or “maimonides”.
  • OR is an inclusive operator, meaning that it returns results containing any of your terms.

Using AND allows you to combine search terms — to match all terms.

  • For example, if you want to find all results for responsa written by Avraham Maimonides, you could search “avraham maimonides” AND responsum.

Using NOT excludes certain results.

  • For example, if you wanted to see all documents written by Avraham Maimonides except for petitions, you could search “avraham maimonides” NOT petition.
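
One way to picture how the three operators combine results is as set operations over matching documents. The sketch below is only an illustration: the four records and their numbers are invented, and the `matching` helper is hypothetical, not part of the PGP.

```python
# Toy corpus: invented record numbers mapped to invented description text.
docs = {
    1: "responsum by avraham maimonides",
    2: "petition to avraham maimonides",
    3: "letter from moses maimonides",
    4: "letter from avraham the cantor",
}


def matching(term):
    """Return the set of record numbers whose text contains the term or phrase."""
    return {doc_id for doc_id, text in docs.items() if term in text}


# avraham maimonides (default OR: records with either word)
any_term = matching("avraham") | matching("maimonides")

# "avraham maimonides" AND responsum (records with both)
all_terms = matching("avraham maimonides") & matching("responsum")

# "avraham maimonides" NOT petition (records with the phrase, minus petitions)
excluded = matching("avraham maimonides") - matching("petition")

print(sorted(any_term))   # [1, 2, 3, 4]
print(sorted(all_terms))  # [1]
print(sorted(excluded))   # [1]
```

Note how the unquoted OR search returns every record, including the letter from Moses Maimonides, while the quoted phrase narrows the results before AND or NOT is applied.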

Search Within a Specific Field

If you want to search within a specific field instead of across everything, you can specify a field name using the syntax field:term or field:“search term”.

For example, if you wanted to search for items where the name Avraham appears in the description, you would use description:avraham.

Note: single word terms (e.g. avraham) don’t need to be in quotes, but any term with a space in it (such as “avraham maimonides” or a shelfmark like “T-S 10J12.16”) must be in quotes, or the search function will treat it as an OR search.

You can search the following fields (note that the field names are case sensitive):

  • pgpid – the PGPID is each document’s unique identifier in our database
  • shelfmark – a shelfmark is a locator indicating where the physical manuscript is held
  • collection – collections, often indicated in shelfmarks, tell us more about where an institution holds the physical item (some libraries have multiple collections)
  • description – the description of each document
  • transcription – transcription means a digital rendering of the document text in the original language (transcription:אברהם will pull all of the documents where someone named Avraham is mentioned in the text)
  • tags – tags are keywords added by our scholars to describe attributes about a document (such as #india or #complaints). Note that our tagging system is inconsistent and incomplete.
  • language_code – this indicates the language of the document (language_code:he will pull up all the Hebrew language documents, ar=Arabic, jrb=Judaeo-Arabic)
  • input_year – this indicates the year the document was added to the PGP

You can combine searching fields with other search syntax, including quotation marks (description:“avraham maimonides”), or Boolean operators (description:“avraham maimonides” AND tag:petition).
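
If you assemble queries in a script or spreadsheet before pasting them into the search bar, the quoting rule is easy to automate. The following is a minimal sketch, assuming straight double quotes are accepted; `field_query` is a hypothetical helper written for this guide, not a PGP function.

```python
def field_query(field, term):
    """Format a field:term query, quoting the term if it contains a space."""
    if " " in term:
        term = f'"{term}"'
    return f"{field}:{term}"


print(field_query("description", "avraham"))
# description:avraham

print(field_query("shelfmark", "T-S 10J12.16"))
# shelfmark:"T-S 10J12.16"

# Field searches can be joined with a Boolean operator.
query = " AND ".join([
    field_query("description", "avraham maimonides"),
    field_query("language_code", "jrb"),
])
print(query)
# description:"avraham maimonides" AND language_code:jrb
```

The helper only handles quoting; it does not check that the field name is one of those listed above or that its case is correct.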

RegEx (Regular Expression) Search

What is RegEx?

RegEx (Regular Expression) is a type of computer search language with specific rules – it allows for powerful and advanced searching for researchers engaging with the Geniza texts in their original languages.

RegEx allows for two types of search: exact matching (what you type in the search bar will match the results precisely) and wildcard search (where adding specific punctuation finds words or phrases with letters missing from your search term).

Please note: Hebrew script search terms will ONLY give you results in Hebrew script. Likewise for Arabic script. Searching in Arabic will not give you Judaeo-Arabic results and vice versa. Use General Search if you want to take advantage of our Arabic to Judaeo-Arabic converter.

Also note: Punctuation marks typed on non-English keyboard layouts may not work. If that happens, switch back to an English keyboard for characters that aren’t letters.

Why would I use it?

Can you read most of a Geniza document but are stuck on a particular word or letter that just doesn’t make sense? Use RegEx to help find similar words that other scholars have already transcribed.

Common use cases:

  • Who is the Moshe b. M in your document? Find all the different Moshes whose fathers’ names begin with M by searching משה בן מ to get משה בן מבשר, משה בן מימון, etc.
  • You can see the letters shin and resh, but it’s unclear what’s in the middle:
    • Search with a period between them (ש.ר) to see possible options with just one letter in between: שטר, שפר, שקר
    • If you’re dealing with more than one letter missing, you can guess at the range of missing characters. For example, to get up to 5, search ש.{1,5}ר to get similar results to above plus שריר, שעל נהר, שי אחר, etc.
      • Note that spaces are considered characters, so if you put a high number for the end of your range, you’ll get a long string of text in the results.
  • You think you’ve found a common phrase but can’t read the words in the middle:
    • Search שהדות .+ דחתומות to get שהדותא דהות באנפנא אנן שהדי דחתמות

And many more. Keep reading below for a full guide to exact and wildcard search, and to see example searches in Hebrew and Arabic scripts.

RegEx instructions and examples:

Exact matching — If you’re looking for an exact match in the transcription, type in the phrase you’re looking for without punctuation.

  1. Ex. searching for שטר will give you שטר, אלשטר, שטרות, etc.
  2. Ex. searching for ר מימון will get you ר מימון, דאר מימון, בר מימון
  3. Ex. to look for different Moshes whose fathers’ names begin with M, search משה בן מ to get משה בן מבשר, משה בן מימון, etc.
  4. Ex. searching for كتب will give you كتب، الكتب، كتبه, etc.
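
The reason an exact search for a word also returns longer words is that the match can fall anywhere inside the transcribed text. The sketch below imitates that behavior with Python's `re` module as a stand-in; the PGP search engine may use a different RegEx dialect, and the sample words are copied from the examples above rather than pulled from the database.

```python
import re

# Sample words taken from the examples above, not from the PGP database.
samples = ["שטר", "אלשטר", "שטרות", "שקר"]

pattern = re.compile("שטר")

# re.search looks for the pattern anywhere in the string, so letters
# before or after the search term do not prevent a match.
hits = [word for word in samples if pattern.search(word)]
print(len(hits))  # 3 (every sample except שקר)

# A search term that contains a space behaves the same way.
print(re.search("ר מימון", "בר מימון") is not None)  # True
```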

Wildcard searching — Punctuation marks and certain symbols work together and separately to retrieve search results beyond exact matches.

  1. If you can’t read a letter in a word, you can use the period in its place to search for likely readings. If you can’t read two letters, you can use two periods.
    1. Ex. searching ש.ר will return results including שטר, שפר, שקר
    2. Ex. searching for ك.ب will return results with the كتب root but also adjacent words where the ك and ب are separated, like مالك بن ناصر
  2. If you’re looking for a word with several missing letters, use curly brackets to search for a specific number of characters in between your search terms. Use it with the period for best results.
    1. Ex. if you can see there are zero to three letters missing in the middle of a word or phrase and want to find potential options for what it could be, search for שה.{0,3}ות to get שהדות, נעשה רשותו, בשהותהא, etc.
    2. Ex. searching for اطال .{4} بقاه will return اطال الله بقاه
  3. If you don’t know how many characters are missing, use a period followed by the plus sign (.+). The plus sign means “match the preceding letter(s) 1 or more times.” Note the right-to-left typing direction in the examples below.
    1. Ex. to find phrases with multiple words in between, search שהדות .+ דחתומות to get שהדותא דהות באנפנא אנן שהדי דחתמות.
    2. Ex. Searching for اطال .+ بقاه will return اطال الله بقاه
  4. The question mark means “optional.”
    1. Ex. if you do not know whether someone is going by Avraham or Ibrahīm, you can search for אברה.?ם, which allows one optional character (such as a yud) before the final mem.
    2. Ex. if you do not know if an author spelled the city of Buṣīr as one or two words, you can search بو?.صير to see results for both بوصير and بو صير. (There are currently no documents that spell Buṣīr as two words; the example is based on an alternate spelling listed in the Places section.)
  5. Square brackets mean “any one of a specific set of characters.” This works like the question mark search above, but lets you limit the match to the characters you list.
    1. Ex. if you want to look for Avraham and Ibrahīm, you can search for אברה[י]?ם to indicate that the yud is optional, but is also the only letter that can be supplied there.
    2. You can also list multiple potential characters in the brackets, such as if you’re looking for both Hebrew and Aramaic spellings of דחתימין/דחתמות by searching דחת[י]?מ[יו] (i.e. the yud is optional and either a yud or vav will follow the mem).
  6. If you want any of the punctuation marks listed above (i.e., the characters that the RegEx search uses as operators) to be matched literally in the transcription, put a backslash before the character. This also applies to the space character (i.e. if you need to add a space before a word to show that a search term is a standalone word and not the end of a longer word).
    1. Ex. If you want to find transcriptions where the editor used a question mark, you can search \?
    2. Ex. if you are searching for ובר א and want the search to treat ובר as its own word rather than the end of a longer word, type a backslash followed by a space before the search term.
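
The wildcard rules above can be tried out offline before running a search. The sketch below checks several of the patterns from this guide against the sample results quoted with them, using Python's `re` module as a stand-in. This is an assumption made for illustration: the PGP search engine may use a different RegEx dialect, and `found` is a hypothetical helper.

```python
import re


def found(pattern, text):
    """Return True if the pattern matches anywhere in the text."""
    return re.search(pattern, text) is not None


# A period stands for exactly one character.
assert found("ש.ר", "שטר") and found("ש.ר", "שקר")

# {m,n} after a period allows a run of m to n characters; spaces count too.
assert found("שה.{0,3}ות", "שהדות")
assert found("שה.{0,3}ות", "נעשה רשותו")
assert found("ש.{1,5}ר", "שי אחר")
assert found("اطال .{4} بقاه", "اطال الله بقاه")

# .+ allows any run of one or more characters.
assert found("اطال .+ بقاه", "اطال الله بقاه")

# ? makes the element before it optional.
assert found("אברה.?ם", "אברהם") and found("אברה.?ם", "אברהים")

# Square brackets limit the match to the listed characters.
assert found("אברה[י]?ם", "אברהים")
assert not found("אברה[י]?ם", "אברהאם")
assert found("דחת[י]?מ[יו]", "דחתימין") and found("דחת[י]?מ[יו]", "דחתמות")

# A backslash turns an operator character into a literal one.
assert found(r"\?", "שטר ?")
assert not found(r"\?", "שטר")

print("all patterns behaved as described")
```

Each assertion pairs a pattern with a result string quoted in the list above, so a failure would point to a difference between the written description and how the pattern actually behaves.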