A first year in texts

I have some long-overdue news to share! I mentioned in this post last August that things had been going well with one of my OkCupid dates. They still are! We’ve been in a relationship for over a year now, and I’m still crazy about him.

Recently, I exported all of our texts to csv to see if I could find anything interesting. Here’s what I found. I’ll use the results to share a little more about our relationship.

Year Overview


Click image for interactive graph

As you can see from this graph, our relationship moved rather quickly. We moved in together less than four months after we started dating. The graph has two major peaks: one at the beginning, and one during the week we moved in together, when we were figuring out all the necessary details. Not surprisingly, we texted a lot less after we moved in together. Since then, we have tended to exchange around 10 to 40 texts a week. I suspect this pattern will continue in the future.

Who texts the most?


Every month, I consistently sent fewer texts than he did. On average, he sent 26 texts per week, while I sent 22.


My texts, however, were generally longer. Mine were 53 characters long on average, while his were 50.

Most Common Phrases

Him                                         Me
Phrase                             Count    Phrase                  Count
I miss you                         19       i miss you              12
u got it babe                      6        i’m not sure            10
i hope you are having a nice day   4        just leaving work now   7
how are you going                  4        on my way home          5
how is your day going              3        have a good sleep       5
you got it babe                    3        i miss you too          5
                                            at the library          5
                                            i’m just heading home   4
i’m just heading home 4

From this you can see we miss each other frequently and I’m often on my way home, unsure or at the library. I find it amusing that “babe” was one of his most common words, when he would never call me that in person.

Smileys and Sad Faces

Number of texts containing a smiley or sad face

Type Him Me
🙂 74 67
😦 26 8

Number of texts containing multiple smiley or sad faces

Type Him Me
🙂 1 8
😦 0 0

We sent a fairly similar number of smiley faces, but he sent more than three times as many sad faces as I did. This was somewhat surprising to me because I think I’m gloomier than he is. I was much more likely to send multiple smiley faces in one text.

To conclude, his texts are smilier and more frequent. Mine are longer and more likely to contain multiple smileys. What does all this tell us about our relationship? Perhaps not a whole lot, but I at least enjoyed putting it together.

The Code

Here is the R code I used to create most of this analysis. I used Jihosoft to export my text messages as a csv and I used the text analyzer at online-utility.org to find the most common phrases.


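#Load the packages used below
 library(plotly)   #plot_ly() and %>%
 library(ggplot2)  #ggplot()
 library(stringr)  #str_count()

#Read in text data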
 texts <- read.csv('C:/Users/Documents/Project Support/Blog/2017/07 July/sms_20170720.csv')

#Convert dates from factor to date
 texts$Date <- as.Date(texts$Date, format = "%d/%m/%Y")

#Get month and week of text
 texts$Month <- as.Date(cut(texts$Date, breaks="month"))
 texts$Week <- as.Date(cut(texts$Date,breaks="week",start.on.monday = FALSE))
 texts$Count <- 1

#Change in/out to him/me
 texts$Sender[texts$Type=="in"] <- "Him"
 texts$Sender[texts$Type=="out"] <- "Me"

### Graph count of texts ###
 #Get count of texts by month and week
 CountByMnthSender <- aggregate(Count~Month+Sender,texts,sum)
 CountByWeekSender <- aggregate(Count~Week+Sender,texts,sum)
 CountByWeek <- aggregate(Count~Week,texts,sum)

### Graph by Week
 green <- rgb(4,185,56,max=255)
 blue <- rgb(0,195,189,max=255)
 coral <- rgb(242,119,111,max=255)
 grey <- rgb(89,89,89,max=255)

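##Color Bars
 #(color assignments reconstructed from the hovertext dates and legend colors below)
 CountByWeek$Color <- grey
 #First Sleepover
 CountByWeek$Color[CountByWeek$Week=="2016-08-07"] <- blue
 #Moved in Together
 CountByWeek$Color[CountByWeek$Week=="2016-10-16"] <- coral
 #Kaikoura Earthquake
 CountByWeek$Color[CountByWeek$Week=="2016-11-13"] <- green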

##Change Hovertext
 CountByWeek$MyText <- ""
 #{Decide what to call first sleepover}
 #First Sleepover
 CountByWeek$MyText[CountByWeek$Week=="2016-08-07"]<-"First Sleepover"
 #Moved in Together
 CountByWeek$MyText[CountByWeek$Week=="2016-10-16"]<-"Moved in Together"
 #Kaikoura Earthquake
 CountByWeek$MyText[CountByWeek$Week=="2016-11-13"]<-"Kaikoura Earthquake"

p2 <- plot_ly(CountByWeek, x = ~Week, y= ~Count, type = 'bar',text=~MyText, marker = list(color = CountByWeek$Color) )%>%
 layout(title = "Texts Per Week")

##create legend
 #empty plot
 plot(1, type="n", axes=FALSE, xlab="", ylab="")
 #legend() call restored; the "topleft" position is an assumption
 legend("topleft",
 c("{First Sleepover}", "Moved In Together","Earthquake"),
 col=c(blue,coral,green), pch = c(15,15,15),
 inset = .02, bty = "n")

#Graph by month and year
 ggplot(CountByMnthSender,aes(Month,Count,fill=Sender)) +
 geom_bar(stat="identity",position='dodge') +
 ggtitle("Texts Per Month") +
 theme(plot.title = element_text(hjust = 0.5))

### Graph average length of texts ###

#Get text length
 texts$Length <- nchar(as.character(texts$Message))

#Get average text length per month
 LengthByMnth <- aggregate(Length~Month+Sender,texts,mean)

ggplot(LengthByMnth,aes(Month,Length,fill=Sender)) +
 geom_bar(stat="identity",position='dodge') +
 ggtitle("Average Length of Texts Per Month") +
 theme(plot.title = element_text(hjust = 0.5))

###Calculate number of texts per week per person
 TxtsPerWeek <- aggregate(Count~Sender,CountByWeekSender,mean)
 AvgTxtLength <- aggregate(Length~Sender,texts,mean)

###Count texts with :) and :(
texts$Smile <- grepl(":)",texts$Message,fixed=TRUE)
texts$Frown <- grepl(":(",texts$Message,fixed=TRUE)
SmileTxts <- aggregate(Smile~Sender,texts,sum)
FrownTxts <- aggregate(Frown~Sender,texts,sum)

##Count texts with multiple :) or :(
texts$SmileCount <- str_count(texts$Message,":[)]")
texts$MultiSmile <- texts$SmileCount > 1
texts$FrownCount <- str_count(texts$Message,":[(]")
texts$MultiFrown <- texts$FrownCount > 1
MultiSmile <- aggregate(MultiSmile~Sender,texts,sum)
MultiFrown <- aggregate(MultiFrown~Sender,texts,sum)
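
The most common phrases came from an online tool rather than from R. As a rough sketch of how something similar could be done in base R, the hypothetical helper below (the name count_phrases and its n argument are my own, not part of the original analysis) counts every run of n consecutive words across a vector of messages. It assumes straight apostrophes and plain a-z text, so its counts will not necessarily match the tool's.

```r
# Hypothetical helper: count each run of n consecutive words across messages.
# Assumes lower-casing is acceptable and that words only contain a-z and '.
count_phrases <- function(messages, n = 3) {
  grams <- unlist(lapply(tolower(messages), function(msg) {
    words <- strsplit(msg, "[^a-z']+")[[1]]
    words <- words[words != ""]               # drop empty pieces
    if (length(words) < n) return(character(0))
    starts <- seq_len(length(words) - n + 1)  # start index of each phrase
    vapply(starts,
           function(i) paste(words[i:(i + n - 1)], collapse = " "),
           character(1))
  }))
  sort(table(grams), decreasing = TRUE)       # most frequent phrase first
}

# Toy example; with the real data one might pass
# texts$Message[texts$Sender == "Me"] instead.
res <- count_phrases(c("I miss you", "i miss you too", "on my way home"), n = 3)
print(res)
```

Running it once per sender and once per phrase length would give tables comparable in shape to the one above.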

Awkward Celebrity Couples

In this post, we’ll take a look at how some famous couples stack up against the age rule of thumb, mentioned in an earlier post. For reference, here are the equations:

Dating range calculations

According to the rule of thumb, dating someone older than your max or younger than your min would be considered objectionable.
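
The equations themselves are in an image and are not spelled out in the text. Assuming the rule is the common "half your age plus seven" version (which agrees with the Hefner figures further down), a minimal R sketch of the range might look like this. The function name dating_range is hypothetical.

```r
# Assumed rule of thumb:
#   youngest acceptable partner = age / 2 + 7
#   oldest acceptable partner   = (age - 7) * 2
dating_range <- function(age) {
  c(min = age / 2 + 7, max = (age - 7) * 2)
}

rng <- dating_range(30)
print(rng)
```

Under that assumption, a 30-year-old's range runs from 22 to 46.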

In the graph below, each line represents a relationship. If the line falls within the blue zone, the age difference of the couple was socially acceptable for that portion of their relationship.


If the graph is confusing to read, hopefully the following diagram helps:

How to Read the Graph

For example, looking at the pink solid line for Demi Moore and Ashton Kutcher, the coordinates for the circle are [42,27], so they got married when Demi was 42 and Ashton was 27. The pink triangle at the other end of the line means the relationship ended in divorce. The coordinates for the triangle are [51,35], so their relationship ended when Demi was 51 and Ashton was 35 [1].

Hugh Hefner

Hugh Hefner, the king of icky relationships, almost made it into the zone of social acceptance with his 20-year relationship with his second wife, Kimberley Conrad, before their divorce. In his current marriage, 90-year-old Hefner would need to stay alive and married to 30-year-old Crystal Harris until he’s 134 and she’s 74 for the couple to cross into the blue zone.
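
The 134 and 74 figures can be checked with a little arithmetic. Assuming the lower bound is "half the older partner's age plus seven", a couple with a fixed age gap enters the blue zone when the older partner reaches twice the gap plus 14. A small R sketch (zone_entry_age is a hypothetical name):

```r
# Assumed rule: the younger partner must be at least older / 2 + 7.
# With a fixed gap, younger = older - gap, so the couple enters the zone
# when older - gap = older / 2 + 7, i.e. older = 2 * (gap + 7).
zone_entry_age <- function(older, younger) {
  gap <- older - younger
  older_at_entry <- 2 * (gap + 7)
  c(older = older_at_entry, younger = older_at_entry - gap)
}

entry <- zone_entry_age(90, 30)
print(entry)
```

For a 60-year gap this gives 134 for the older partner and 74 for the younger, matching the numbers above.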

Woody Allen and Soon-Yi Previn

The 35-year age difference between Woody Allen and his wife, Soon-Yi Previn, isn’t the only thing creepy about this relationship. Woody first became involved with then-21-year-old Soon-Yi while he was still in a relationship with Soon-Yi’s adoptive mother, Mia Farrow. He had even adopted some of Soon-Yi’s younger siblings.


Sun Myung Moon

At age 40, Sun Myung Moon, the controversial founder of the Unification Church (the church I was raised in), married his wife when she was just 17. During their 52-year marriage, they had 13 children with varying degrees of craziness, the latest iteration being their youngest son’s arms manufacturing business with endorsement from Donald Trump’s son (read more here).

Donald Trump

Surprisingly, of all the couples, Donald and his wife Melania have the least objectionable partnership, at least based on age alone. They’ve been safely within the blue zone for most of their relationship. Perhaps what makes them an awkward couple is their mismatched levels of attractiveness. (Or because everything about Trump is objectionable.)

Demi Moore

Demi (Guynes) Moore married Freddy Moore when she was 17 and he was 29. Later she switched sides, marrying 27-year-old Ashton Kutcher when she was 42. For most of their relationship, the age difference between Demi and Ashton was very close to the blue zone. This could suggest that their awkwardness as a couple came from her being an older woman dating a younger man, rather than from the size of their age difference alone, which is further evidence that the rule of thumb could use some adjusting.

1. Note: Ages are estimated from Wikipedia, which often only lists the year of marriage or divorce, rather than the exact date. Exact ages at marriage or divorce may be slightly off because of this.

The Code


age_plot() #see "Calculate Your Dating Age Range" post for code

dark_blue <- rgb(68,84,106, max = 255)
blue <- rgb(96,147,125, max = 255)
yellow <- rgb(217,192,7, max = 255)
purple <- rgb(122,98,145, max = 255)
pink <- rgb(247,190,202, max = 255)
hot_pink <- rgb(201,6,45, max = 255)

#plot couples

####Hugh Hefner and Kimberly Conrad
add_seg(26,63,47,84, end = "divorce", col = dark_blue, 
     lty = 'dashed')

####Hugh Hefner and Crystal Harris
add_seg(26,86,30,90, end = "", col = dark_blue)

####Woody Allen and Soon-Yi Previn
add_seg(26,61,45,80,end="", col = purple)

####Rev. and Mrs. Moon
add_seg(17,40,69,92, end = "death", col = yellow)

####Donald Trump and Melania Trump
add_seg(35,58,46,69,end="",col = blue)

####Demi Moore and Freddy Moore
add_seg(17,29,22,34, end = "divorce", col = hot_pink , 
     lty = 'dashed')

####Demi Moore and Ashton Kutcher
add_seg(42,27,51,35,end = "divorce", col = hot_pink)


#Helper used by the calls above; define it before running them
add_seg <- function(x1,y1,x2,y2,end="",...){
     segments(x1,y1,x2,y2,lwd=1.5,...) #plot line segments
     points(x1,y1,pch=16,...)          #plot left endpoint
     #add endpoint: triangle for divorce, square for death, nothing otherwise
     if (end == "divorce") {
          points(x2,y2, pch = 17,...)
     } else if (end == "death") {
          points(x2,y2, pch = 15,...)
     }
}

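celeb_legends <- function(){
     #empty plot
     plot(1, type="n", axes=FALSE, xlab="", ylab="")
     #main legend (legend() call restored; the "topleft" position is an
     #assumption, and the colors follow the add_seg() calls above)
     legend("topleft",
          c("Hugh Hefner and Crystal Harris",
          "Hugh Hefner and Kimberley Conrad",
          "Woody Allen and Soon-Yi Previn",
          "Sun Myung and Hak Ja Han Moon",
          "Donald and Melania Trump",
          "Demi Moore and Freddy Moore",
          "Demi Moore and Ashton Kutcher"),
          col=c(dark_blue,dark_blue,purple,yellow,blue,hot_pink,hot_pink),
          lty=c(1,2,1,1,1,2,1), lwd=2,
          inset = .02, bty="n")
     #endpoints legend (the "bottomleft" position is an assumption)
     legend("bottomleft",
          c("marriage", "divorce","death"),
          col=hot_pink, pch = c(16,17,15),
          inset = .02, bty = "n")
}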


Where are all the single men?

According to an interactive map from Jonathan Soma, they’re everywhere.

Single men outnumber single women across the country up until age 35.



From age 35 onward, the balance starts shifting toward more single women until about age 50, where single women strongly outnumber the men.


Play around with the interactive version here and see for yourself. You can adjust the slider to change the age range.

Note: From what I can tell ‘single’ here means ‘never been married’, rather than ‘not currently in a relationship’.


Robin Weis, the same girl who brought us 8 years of dating data, tracked her crying patterns for 589 days, rating them on a scale from ‘a tear or two’ to ‘I am a crumpled pile of flesh’. She cried on 216 of those days. And I thought I cried a lot.

 Number of Cries Per Day


She sorted each cry into 8 general categories, shown in the graph below. The mound of purple life-related cries on the left side was largely during a 10-week trip to Europe. A large proportion of her cries were breakup and relationship related, which included finding out her boyfriend was married. Yep, that’s bound to cause some tears. Check out her full post here.

Categories of Cries Over Time


What I’ve found interesting

  • Relationships and breakups appear to cause a lot of negative emotions. It’d be interesting to see what a graph of positive emotions due to relationships would look like, though that would be harder to quantify. Crying generally has an obvious beginning and end, but how would you track your start of happy feelings and end of happy feelings as precisely?
  • Is travel-crying a thing? Robin mentions that almost 20% of her crying occurred while she was traveling solo. I had a similar experience recently when I was in South America. It can be particularly uncomfortable if you’re staying in a hostel and there’s nowhere private where you can just go and cry. Has anyone else experienced this?

As I’m taking my first steps into the world of online dating (I just signed up to OkCupid for the first time), it can be a little scary seeing the amount of angst relationships can cause. Might I be more comfortable staying safely single?

I hate to be yet another blog that touts the benefits of travel, but my experience with travel is relevant here. Even though I spent a lot of time being unhappy while traveling, overall it was a rewarding experience that enhanced my life and I’d do it again. As for dating, it’s a risk I’m willing to take.