Project 4 - Will Power
On Shakespeare:
William Shakespeare is deserving of our praise, methinks. A pioneer of prose and verse, his writing has endured the trials of time. The themes in his works have spoken to audiences for more than four centuries, and his influence is still felt today in literature, drama, and poetry.
What better writer could one pick to test the limits of Natural Language Processing?
I turned to the sonnets, a group of 154 structured poems authored beginning in 1599 and published in 1609 along with the long-form poem A Lover’s Complaint. Using BeautifulSoup, I scraped all 154 poems, along with 154 contemporary translations in case Shakespeare’s portfolio proved too dense or too subtle.
I was thrilled to find I wouldn’t need them.
On Sonnets:
The sonnet was not a new construction, even in Shakespeare’s time. It had emerged nearly 400 years prior in 13th-century Italy, where Sicilian poets, inspired by pastoral Provençal love poems, created their own poetic structures with intricate rhyme schemes.
In the following centuries, the sonnet would continue to evolve. By the time Shakespeare embarked on his poetic pilgrimage, a new English sonnet tradition had been in place for more than half a century, pioneered by Sir Thomas Wyatt and Henry Howard, Earl of Surrey. These English sonnets were extremely rigorous in rhyme, meter, and verse.
Rather than wilt within these poetic boundaries, Shakespeare’s sonnets burst forth like blooms. Shakespeare leverages the inherent structure of the form rather than fighting against it. His body of sonnets is a triumphant expression of cunning composition.
On Structure:
Elizabethan sonnets have three strict conventions to uphold:
Verse
Sonnets are arranged in 4 stanzas or sections. Stanzas 1-3 are quatrains (four lines), and stanza 4 is a couplet (two lines).
Rhyme
The rhyme scheme is regular throughout the stanzas. Each quatrain alternates between its own pair of rhymes (ABAB, then CDCD, then EFEF), and the two lines of the concluding couplet rhyme with each other (GG).
Meter
The sonnets, along with a large portion of William Shakespeare's total corpus, are written in poetic meter. Will's meter is particularly rigorous, mostly constructed in iambic pentameter, meaning there are 10 syllables in each line, grouped into 5 feet of 2 syllables each, the first unstressed and the second stressed (da-DUM). If you can think of the musical theme to The Pink Panther, that is an iambic foot.
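The verse and rhyme conventions above are regular enough to be written down as a few lines of code. The following is a minimal sketch, not part of the original analysis: the function name `sonnet_layout` and the constant `IAMBIC_PENTAMETER` are hypothetical, and the code only labels line positions. It does not check that the words actually rhyme or scan.

```python
from string import ascii_uppercase

# Five iambic feet: an unstressed syllable followed by a stressed one.
IAMBIC_PENTAMETER = ["da", "DUM"] * 5


def sonnet_layout(lines):
    """Label each of 14 lines with its stanza number and rhyme letter."""
    if len(lines) != 14:
        raise ValueError("an English sonnet has exactly 14 lines")
    labelled = []
    for i, line in enumerate(lines):
        if i < 12:
            # Three quatrains, each alternating between its own two rhymes.
            stanza = i // 4 + 1
            letter = ascii_uppercase[2 * (stanza - 1) + i % 2]
        else:
            # Closing couplet: both lines share the seventh rhyme letter.
            stanza = 4
            letter = ascii_uppercase[6]
        labelled.append((stanza, letter, line))
    return labelled


placeholder_lines = [f"line {n}" for n in range(1, 15)]
scheme = "".join(letter for _, letter, _ in sonnet_layout(placeholder_lines))
print(scheme)
print(" ".join(IAMBIC_PENTAMETER))
```

Passing the 14 lines of a real sonnet in place of `placeholder_lines` yields the stanza and rhyme columns for a table of that poem.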
We can see all three parameters at work in one of Shakespeare’s most iconic sonnets:
Sonnet 18
| Verse | Rhyme | Text |
|---|---|---|
| Stanza 1 - Quatrain | A | Shall I compare thee to a summer's day? |
| | B | Thou art more lovely and more temperate: |
| | A | Rough winds do shake the darling buds of May, |
| | B | And summer's lease hath all too short a date: |
| Stanza 2 - Quatrain | C | Sometime too hot the eye of heaven shines, |
| | D | And often is his gold complexion dimm'd; |
| | C | And every fair from fair sometime declines, |
| | D | By chance or nature's changing course untrimm'd; |
| Stanza 3 - Quatrain | E | But thy eternal summer shall not fade |
| | F | Nor lose possession of that fair thou owest; |
| | E | Nor shall Death brag thou wander'st in his shade, |
| | F | When in eternal lines to time thou growest: |
| Stanza 4 - Couplet | G | So long as men can breathe or eyes can see, |
| | G | So long lives this and this gives life to thee. |
Using NLP:
While our algorithms can group our terms into like categories, it is our job as readers and data scientists to interpret their meaning. To process the poems, I used a term frequency-inverse document frequency (TF-IDF) vectorizer and ran the vectorized text through a non-negative matrix factorization (NMF) model. After iterating through a range of different numbers of categories, the best fit for our poems was a partition of four categories. Below, you will find the key words for each category, labeled with my interpretation of their shared subject.
Category 1: Love’s Thrall
love, loves, true, hate, new, prove, dear, sweet, soul, sake
Category 2: Illusion and Perception
heart, eyes, eye, hearts, sight, face, looks, picture, thoughts, right
Category 3: Objective Beauty
beauty, praise, fair, muse, sweet, beautys, old, days, truth, worth
Category 4: The Passage of Time
time, world, life, times, make, earth, night, day, happy, away
Applying Categories:
With these categories from our model, we can go back and assess each poem on how well it falls into a given category. From this, we can begin to see not just a collection of love poems but an exploration of the deeper and subtler features of love.
Sonnet 18:
Shall I compare thee to a summer’s day?
Thou art more lovely and more temperate:
Rough winds do shake the darling buds of May,
And summer’s lease hath all too short a date;
Sometime too hot the eye of heaven shines,
And often is his gold complexion dimm'd;
And every fair from fair sometime declines,
By chance or nature’s changing course untrimm'd;
But thy eternal summer shall not fade,
Nor lose possession of that fair thou ow’st;
Nor shall death brag thou wander’st in his shade,
When in eternal lines to time thou grow’st:
So long as men can breathe or eyes can see,
So long lives this, and this gives life to thee.
Category score:
Love 0.0
Illusion 0.02992
Beauty 0.18229
Time 0.07058
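The scores look a bit odd at first glance, particularly the 0.0 for Love, but none of that category’s key words (love, true, dear, sweet, and so on) appears anywhere in Sonnet 18. The other categories come into focus when we return to the poem: beauty is prominently featured in mentions of loveliness and fairness, time is expressed in the passage of seasons from summer, and illusion makes an appearance at the end, in the eyes that witness the perpetuity of love.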
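One plausible way to obtain per-poem scores like these is to project a single poem onto the fitted topics and read off its weights. The sketch below again assumes scikit-learn, with a hypothetical toy corpus and a hypothetical helper `category_scores`; it shows the mechanics only and will not reproduce the numbers above.

```python
from sklearn.decomposition import NMF
from sklearn.feature_extraction.text import TfidfVectorizer

# Toy stand-in corpus (hypothetical); the real model was fit on all the sonnets.
corpus = [
    "beauty is fair and worth all praise",
    "fair beauty and sweet truth",
    "time and the world wear the day away",
    "night and day the time of life runs away",
]

vectorizer = TfidfVectorizer(stop_words="english")
model = NMF(n_components=2, init="nndsvd", random_state=0, max_iter=500)
model.fit(vectorizer.fit_transform(corpus))


def category_scores(poem):
    """Return one non-negative weight per topic for a single poem."""
    weights = model.transform(vectorizer.transform([poem]))[0]
    return [round(float(w), 5) for w in weights]


scores = category_scores("every fair from fair sometime declines by day")
print(scores)
```

A category whose vocabulary never occurs in the poem gets a weight at or near zero, which is consistent with the Love score for Sonnet 18.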
Conclusion:
It is surprising, as a contemporary analyst, to see how clear-eyed Shakespeare was in his writing. For the common reader, the writing can be almost overwhelmingly dense. However, when analyzing the work programmatically and as a whole, we begin to see how each piece resembles a diamond in a crown: rigidly structured, intricately multifaceted, and joyously resplendent. And though he may not be around to wear it anymore, Bill still deserves it for his contributions to literature and to the English language as a whole.