FAQ

THE CAIRO GENIZA

What is the Cairo Geniza?

The Cairo Geniza is a cache of roughly 400,000 manuscript fragments that survived in the Ben Ezra Synagogue in Cairo.
For more, see What Is the Cairo Geniza?

What kinds of texts did the geniza preserve?

An estimated ninety percent of geniza fragments come from long-form literary texts, including liturgy, Hebrew Bible, rabbinic literature, philosophy, medicine, astronomy, astrology, Jewish law, lexicography, poetry and theology.
The remaining ten percent are documents: letters, legal deeds, lists, accounts, state documents and other everyday writings and ephemera.

Where were geniza texts written?

While all the texts survived in Egypt, many were written in other places across the Mediterranean and Indian Ocean basins, including the Iberian peninsula, North Africa, Sicily, Syria, Yemen, Iraq, Iran and India.

When is the material from?

The majority of geniza documents date to the period between 950 and 1250 CE.
There are also sizable clusters from the sixteenth and nineteenth centuries.

Where are the contents of the geniza now?

Geniza documents can now be found in more than sixty libraries and private collections in Europe, North America and the Middle East.

For more, see Resources.

THE PRINCETON GENIZA PROJECT

What is the Princeton Geniza Project?

The PGP is a database devoted to the documentary geniza fragments, which make up ten percent of the whole Cairo Geniza cache.
The PGP database also includes documents preserved outside the Ben Ezra Synagogue. Some are included because they're similar to geniza documents in form and formulae and come from the same historical contexts (such as eleventh-century Egypt or twelfth-century Sicily). Others are included because, during the late nineteenth and early twentieth centuries, dealers, collectors and libraries presumed that many medieval documents in Hebrew script came from the Ben Ezra Synagogue, when their provenance is in fact unknown.

How did the project begin?

The PGP was founded in 1986 by Mark R. Cohen and A. L. Udovitch to digitize transcriptions of geniza documents. The first 2,200 transcriptions they uploaded were those of their teacher S. D. Goitein (1900–85).
Over the decades, the PGP has come to include transcriptions by many other researchers, as well as descriptions and research aids.
Version 4.x of the PGP, launched in 2022, includes high-resolution images.

What is the Princeton Geniza Lab? Is it the same as the PGP?

The PGP is a longstanding project of the Princeton Geniza Lab, but the Lab now houses other projects.
For more, see History of the Princeton Geniza Lab and the Lab’s Projects page.

For whom is the Princeton Geniza Project intended?

The PGP began as a resource for professional geniza researchers, but we’ve recently overhauled it to make it more accessible to non-specialists, including students and the public.
Anyone interested in the social and economic history of the medieval and early modern Middle East and its Jewish communities can benefit from the PGP.
People in other fields can also make use of our perspectives, resources and approaches, including those working on other digital humanities projects, linguists, open source advocates, librarians, archivists and software engineers.

What are the aims of the PGP?

- To provide access to geniza documents and the interim products of scholarship on the geniza, including unpublished notes and transcriptions of scholars in the field.
- To facilitate access to documents in order to fuel research into premodern global history.
- To capture all the documentary texts from the Cairo Geniza and related caches, such as the Ottoman and modern Jewish community archives in Cairo.

How are PGP entries structured?

PGP records include five kinds of information:
- Classifications. Each document is titled with a shelfmark (the call number in the collection where it’s housed) and classed into one of six types, including legal document, letter, list or table, paraliterary text and state document.
- Descriptive information. Two-thirds of our entries have detailed descriptions of the document’s contents. Many also have #tags, but tags aren’t comprehensive; they merely represent the interests of the researchers who have done the tagging.
- Images. We currently display images from two collections: Cambridge University Library and the Jewish Theological Seminary. The images are displayed in conformity with the International Image Interoperability Framework (IIIF). As more geniza-holding institutions adopt the IIIF, we will add their images to our site.
- Transcriptions. Because it can be challenging to read the handwriting of medieval scribes, scholars produce typewritten copies that can be read easily and searched digitally. Transcriptions are also referred to as scholarly editions. The PGP currently has 3,707 transcriptions, with more to come.
- Scholarship records. Our records list who has transcribed the document (as well as whether the transcription has been published, and if so, where). They also list the published books and articles or unpublished notes from which we have derived the information in our descriptions.
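
For readers who think in code, the five kinds of information listed above can be pictured as a single record type. What follows is only an illustrative sketch in Python: the class names, field names and sample values are assumptions made for this example, not the PGP's actual schema.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ScholarshipRecord:
    """Who worked on a document, and where (if anywhere) it was published."""
    scholar: str
    role: str                           # hypothetical values: "transcription", "source"
    published_in: Optional[str] = None  # None means unpublished


@dataclass
class PGPRecord:
    """Hypothetical shape of one entry; field names are assumptions."""
    shelfmark: str                      # classification: where the fragment is housed
    doc_type: str                       # classification: e.g. "letter"
    description: Optional[str] = None   # descriptive information (not every entry has one)
    tags: list[str] = field(default_factory=list)
    image_urls: list[str] = field(default_factory=list)
    transcription: Optional[str] = None
    scholarship: list[ScholarshipRecord] = field(default_factory=list)

    def has_transcription(self) -> bool:
        """True if a transcription has been entered for this record."""
        return self.transcription is not None


# A made-up record with a placeholder shelfmark, for illustration only.
record = PGPRecord(shelfmark="Example Collection 1", doc_type="letter", tags=["trade"])
```

Optional fields default to empty values because, as noted above, not every entry has a description, tags, images or a transcription.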

Can you briefly describe your data model?

First: what is a data model? A data model is a way to organize different types of data (some examples in our case: documents, fragments, images and descriptions), and to standardize how they relate to each other.
At the core of our data model is a many-to-many relationship between physical fragments and the textual units that we call documents. A single fragment can contain multiple documents (as when a scribe used the blank back of a page to write another text). Conversely, a single document can be written across multiple fragments, as when a text was torn and the pieces now have different classmarks or are held in different libraries.
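
A many-to-many relationship of this kind can be sketched as a pair of lookup tables, one in each direction. The Python below is a minimal illustration, not the PGP's implementation: the `Catalog` class, its method names, the document identifiers and the shelfmarks are all invented for the example.

```python
from collections import defaultdict


class Catalog:
    """Toy many-to-many index between fragments and documents (hypothetical)."""

    def __init__(self):
        # shelfmark -> set of document ids written on that fragment
        self._docs_by_fragment = defaultdict(set)
        # document id -> set of shelfmarks the document is spread across
        self._fragments_by_doc = defaultdict(set)

    def link(self, document_id, shelfmark):
        """Record that (part of) a document is written on a fragment."""
        self._docs_by_fragment[shelfmark].add(document_id)
        self._fragments_by_doc[document_id].add(shelfmark)

    def documents_on(self, shelfmark):
        """All documents found on one fragment, in sorted order."""
        return sorted(self._docs_by_fragment.get(shelfmark, ()))

    def fragments_of(self, document_id):
        """All fragments that carry part of one document, in sorted order."""
        return sorted(self._fragments_by_doc.get(document_id, ()))


catalog = Catalog()
# One fragment, two documents: a scribe reused the blank back of a letter.
catalog.link("letter-1", "Library A, MS 1")
catalog.link("account-1", "Library A, MS 1")
# One document, two fragments: a torn letter now held in two libraries.
catalog.link("letter-2", "Library A, MS 2")
catalog.link("letter-2", "Library B, MS 9")

print(catalog.documents_on("Library A, MS 1"))  # ['account-1', 'letter-1']
print(catalog.fragments_of("letter-2"))         # ['Library A, MS 2', 'Library B, MS 9']
```

Keeping both directions in sync inside a single `link` call is what lets either question, "what is on this fragment?" or "where are the pieces of this document?", be answered directly.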

Which philological transcription conventions do you follow?

The PGP has followed varying sets of transcription conventions over its long history, sometimes reflecting the choices of the text-editors whose editions we have digitized.
We instituted a transcription reform in 2021, but we've decided not to apply the new conventions retroactively, at least for the moment.
Here are both the old and new PGP transcription conventions.

How do I get involved with the PGP?

We have a team of dedicated and talented researchers who have come to us from many directions. They include undergraduates, graduate students, postdocs and faculty at Princeton and other institutions, as well as teachers, librarians and other professionals interested in Judaic and Islamic studies.
If you would like to contribute information to the PGP, we’ll soon be adding links to document records through which you can add suggestions. In the meantime, please contact us.
If you would like to work for the Princeton Geniza Lab, write to us. We welcome inquiries from students and researchers, as well as data specialists, software developers and machine learning experts with ideas for specific research projects or modules.

SKILLS

What languages do I need to know to read geniza documents?

We are including more and more English translations in the PGP.
If you want to read documents in the original, the main language you need is Judaeo-Arabic (Arabic written in the Hebrew alphabet).
There are also many documents in Arabic, Hebrew, Aramaic, Ladino and Persian.
In many medieval documents, there are numbers written in Coptic.
For more on the languages that are useful for studying geniza fragments, see Resources.

Wait. This Hebrew doesn’t look like the Hebrew I learned how to read. What gives?

Modern, printed Hebrew is derived from square script, a formal register of Hebrew that geniza scribes wrote when they were getting paid to copy a text or trying to impress their readers. Scribes writing for everyday purposes tended to use more informal handwriting, with varying degrees of cursiveness.
Learning to read premodern handwritten Hebrew is a skill that can be mastered with time and motivation. (You might want to start with this aleph-bet chart created by Laura Newman Eckstein for a Zooniverse project we helped with, Scribes of the Cairo Geniza.)

Why are there so few dots in the Arabic documents, and how on earth do you expect me to make sense of these scribbles?

Documentary Arabic hands are notoriously difficult to decipher. These are some of the challenges:
- a dearth of canonical dots, which renders many letter-shapes ambiguous; the sporadic dot phenomenon inspired this classic study
- scribes’ reluctance to lift the pen, which created abusive ligatures, or strokes connecting letters that in standard Arabic writing should remain unconnected
- Verschleifung (literally, “slurring”) of pen-strokes, so that letters are skipped or subsumed into other letters; for instance, اربع is often written لع (with Verschleifung of the letters and an abusive ligature after the alif)
The good news is that most geniza documents are in Hebrew script, which is rarely as cursive as Arabic (but there are exceptions, among them the handwriting of Yehuda Halevi and Moses Maimonides).

Which additional skills do I need to study the geniza documents?

Patience and spreadsheets.
Patience is essential because the texts are dispersed across dozens of institutions and they aren’t all cataloged.
Good record-keeping is essential because there are thousands of documents in no particular order. If you have worked in an archive, you’ve had the luxury of someone else creating order before your arrival. If you work with geniza fragments, you’re often assembling your own archives (or dossiers, which is the technical term to use when the material you’re assembling wasn’t actually archived). We use lots of spreadsheets.
Other technical skills one can expect to pick up include understanding how legal documents are structured, recognizing scribes by their handwriting, recognizing the names of coins and of units of weight and measurement, and learning the patterns of shelfmarks in dozens of library collections.

For more, check out our Resources page!

IS IT GENIZA OR GENIZAH?

Both are correct. The two spellings reflect differing ideas about how to transcribe the Hebrew feminine ending ָ–ה (also found in Torah, Mishnah and matzah).
We follow S. D. Goitein in writing geniza. His transliteration style was in turn probably influenced by Arabic transliteration, where the same debate exists over the letter ـة but the -h is more commonly omitted.
For more, see our Conventions for Transcribing Arabic and Hebrew to Latin Script.