
Linking Crimes, Part II

May 2, 2012

                In the first part of this article, published yesterday, I outlined a method to "code" a murder, toward the greater goal of determining whether two crimes are related.   In this section, the second and final part of the article, I will outline a way that we can measure the similarity of codes.

                Let us begin with the codes that I outlined yesterday, and in a minute I’ll re-explain what all of this gibberish means:

Bates     Cheri Jo   F  4  C  DM13hx  U  1  S  2   S  1  1966-303  C  GJ
Borden    Abby       F  7  C  FN41kq  D  1  S  10  A  1  1892-217  A
Borden    Andrew     M  7  C  FN41kq  D  1  S  10  A  1  1892-217  A
Brandley  Robbin     F  4  C  DM06gs  U  1  S  4   A  2  1986-18   C
Davies    John       M  3  C  CM87um  X  X  X  2   X  1  1981-311  D
Gilbert   Robin      F  3  C  FN42km  U  1  X  2   S  1  1975-183  C  P
Masgay    Louis      M  6  C  FM29kw  R  2  M  1   R  3  1981-182  B
Moxley    Martha     F  3  C  FN31ea  U  1  B  4   S  1  1975-303  C  P
O'Keefe   Michelle   F  4  C  DM04vq  V  1  M  1   S  1  2000-53   C
Parkman   George     M  6  C  FN42li  B  2  B  7   A  1  1849-327  B
Smart     Gregg      M  5  C  FN42iv  D  1  G  1   X  1  1990-121  C
Stine     Paul       M  5  C  CM87ss  V  1  G  1   R  2  1969-284  C  J

 

                For reasons that will shortly become apparent, I’m going to move the two "long" elements of this code—the time and the place—to the far right of the code.    These are the re-edited codes, with an abbreviated legend:

CrimeLink_1

 

Column 1--Last Name

Column 2--First Name

Column 3--Gender

Column 4--Age range (1--infant; 8--old person)

Column 5--Race (C--Caucasian, A--African American, O--Oriental)

Column 6--Was the body found inside or outside, specific codes given yesterday.

Column 7--Was the Body found where the crime occurred, Yes (1) or No (2).

Column 8--Cause of Death (codes given yesterday)

Column 9--Instrument of Death (codes given yesterday)

Column 10--Apparent Motive (Anger, Robbery, Sexual Assault, X/Unknown.)

Column 11--Risk category of victim (1--Low, 3--High)

Column 12--Time of murder (A--morning, B--afternoon, C & D--Night)

Column 13--Codes for Signature Elements (List given yesterday)

Column 14--Location where body was found (Maidenhead Grid System)

Column 15--Year in which murder was committed.

Column 16--Julian Date on which murder was committed.

 

                Here’s how I propose to score the similarity of two codes.   In the first part, assessing columns 3 through 13 (and with an interloping step) we will ADD points for "matches" between the codes.  In the second part, we will reduce this score for significant separations in time and space between two cases.

                Step One:  If the genders of the two victims are the same, start with 100 points.  If they are not the same, start with zero points, unless the motive in both cases appears to be robbery.    If the motive in both cases appears to be robbery, score the similarity as 100 if the genders match, and 75 if they do not.

                Step Two:  If the two victims are in the same age cadre (Column 4), add 100 points.   If they are not in the same age cadre, reduce those 100 points by 25 for each step of difference in the code, but not below zero; victims one cadre apart add 75 points, and victims four or more cadres apart add nothing.

                Step Three:  If the race of the two victims is the same, add 25 points.  (It was for many years believed that all serial murderers were race-specific.  This is now known to be untrue, and really, there was never any reason to believe that it was true.  If the race of two victims is different, it is noted but is not a big deal.)

                Step Four:  Column 6 represents whether a body was found in a Dwelling (D), another building of some kind (B), in an Urban area (U) or a Rural area (R) or a Vehicle (V).   If these codes match, add 50 points; if they do not match, add zero.   However, if either code is "X" (Unknown), add 25. 

                Step Five:  Column 7 represents whether the body was found where the person was killed (1) or was moved after death (2).   If these two match, add 30 points.   If one or the other is a "3" (unknown), add 15 points.   Otherwise, no points.

                Step Six:  If the code representing the cause of death for the two victims is the same, score that as 100 points.   If it is not the same, that generally is scored as zero.  

                However, if the code for one victim is "G" (Gunshot) and the other is "M" (Multiple gunshots) also enter that as "100".   Also, if the code for either victim is "X", enter that as "50".  

                Step Seven:  If the codes for cause of death and instrument of death BOTH match, add 100 points.   "G" and "M" are NOT considered matching codes on Step 7.   If the cause of death of either victim is unknown, add 50 points.   If the causes of death for both victims are known and there is anything other than a perfect match in Column 8 OR 9, zero points.

                Step Eight:  If the codes for the apparent motives match, add 75 points.  If they don’t match, zero points.   (In this case, no points for unknown apparent motives.)

                Step Nine:   Column 11 is for the risk category of the victim (1—Low, 2—Moderate, 3—High.)   If these two entries match, add 50 points.   If they are separated by one, add 25 points.   If they are separated by two, no points.  

                Step Ten:  If the two persons are known to have been murdered in the same general time of the day (A, B, C or D), add 25 points.   If both were killed in daylight (but one in the morning and the other in the afternoon) or if both were killed at night, add 15 points.   If one was killed in daylight and the other at night, or if the time of death of one or the other is unknown, no points. 

                Step Eleven:  The codes for Signature Elements are tricky, and we may have to concede some objectivity here to recognize similarities that are too important to ignore.   However, I have two rules here:

                1.   If two victims have the same "signature code element" but the activities or conditions described by this code are not too similar, add 20 points.

                2.   If the activities or conditions described by the code are in fact quite similar, add up to 100 points.  

                In this case we have two "signature element" matches.   The murderers of both Bates and Stine wrote letters to the police after the event, describing what had happened.   The letters are very, very different; however, the basic fact is that both criminals engaged in the extremely unusual activity of writing to police and media about their crime.    That in itself is a high degree of similarity, and I’m going to score it at 85 points.  

                In the cases of Robin Gilbert and Martha Moxley, both had their clothes torn open, but they were not raped.    Again, I see that as a very substantial similarity between the two cases, and I am going to score it at 70 points.

                It’s important to fix this little glitch in the system, because we can’t really computer-generate scores for comparisons of a large number of cases if we have to review the facts of each case and make a judgment about the degree of similarity.   It has to be more organized than that, more systematic.  But for now, we’ll make do with these judgments, OK?

                Step Twelve:   This is the "interloping" step to which I referred before.   The question we are asking here is, "Is there an actual relationship between two victims?"   If a husband and wife are murdered, even if they are murdered in very different ways and at different times, one has to look at those cases as possibly linked; it’s awkward and it screws up my beautiful theory of how this could work, but it’s too significant to just ignore.   So here’s the rules; it has to do not with the code for any victim, but with the relationship between the two:

                1)  If two murder victims are closely related (husband and wife, brothers, brother and sister, father and son, etc.), add 100 points to the total.

                2)  If two murder victims have some less significant relationship (former lovers at some remove in time, co-workers, more distant relatives, etc.) add 75 points to the total.

                3)  If two murder victims are acquainted at a meaningful level (members of the same church, former co-workers, went to school together and have kept in touch, etc.), add 50 points to the total.

                4)  If two murder victims are not known to be connected, but had significant mutual acquaintances (people they both definitely knew and definitely associated with, as for example two people who may have worked at a large factory at the same time), add 25 points to the total. 

                This only applies, in this case, to Andrew and Abby Borden.   Otherwise none of these people are acquainted. 
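                Steps One through Twelve can be sketched in a few lines of Python.   This is only an illustration, not a finished implementation: the names Case, match_points and raw_similarity are invented here, the signature and relationship points are passed in by hand because Steps Eleven and Twelve are judgment calls, and two details are assumptions, namely that an "X" in Column 7 counts as unknown alongside "3", and that an unknown code is checked before a match.

```python
from dataclasses import dataclass

UNKNOWN = "X"


@dataclass
class Case:
    gender: str      # column 3
    age: int         # column 4, age cadre
    race: str        # column 5
    location: str    # column 6: D, B, U, R, V or X
    moved: str       # column 7: "1", "2", or unknown
    cause: str       # column 8
    instrument: int  # column 9
    motive: str      # column 10: A, R, S or X
    risk: int        # column 11: 1, 2 or 3
    time: str        # column 12: A, B, C, D or X


def match_points(a: Case, b: Case) -> int:
    """Points from Steps One through Ten."""
    pts = 0
    # Step One: gender, with the robbery exception
    if a.gender == b.gender:
        pts += 100
    elif a.motive == "R" and b.motive == "R":
        pts += 75
    # Step Two: 100 points less 25 per cadre of difference, floor of zero
    pts += max(0, 100 - 25 * abs(a.age - b.age))
    # Step Three: race
    if a.race == b.race:
        pts += 25
    # Step Four: where the body was found (unknown checked first: assumption)
    if UNKNOWN in (a.location, b.location):
        pts += 25
    elif a.location == b.location:
        pts += 50
    # Step Five: body moved or not ("X" treated like "3": assumption)
    if a.moved in ("3", UNKNOWN) or b.moved in ("3", UNKNOWN):
        pts += 15
    elif a.moved == b.moved:
        pts += 30
    # Step Six: cause of death; G and M count as a match here
    causes = {a.cause, b.cause}
    if UNKNOWN in causes:
        pts += 50
    elif a.cause == b.cause or causes == {"G", "M"}:
        pts += 100
    # Step Seven: cause AND instrument; G and M do not match here
    if UNKNOWN in causes:
        pts += 50
    elif a.cause == b.cause and a.instrument == b.instrument:
        pts += 100
    # Step Eight: motive, no points for unknowns
    if a.motive == b.motive and a.motive != UNKNOWN:
        pts += 75
    # Step Nine: risk category
    pts += {0: 50, 1: 25}.get(abs(a.risk - b.risk), 0)
    # Step Ten: time of day
    if UNKNOWN not in (a.time, b.time):
        if a.time == b.time:
            pts += 25
        elif (a.time in "AB") == (b.time in "AB"):
            pts += 15
    return pts


def raw_similarity(a, b, signature_points=0, relationship_points=0):
    """Steps One through Twelve, before any time-and-place adjustment."""
    return match_points(a, b) + signature_points + relationship_points


abby = Case("F", 7, "C", "D", "1", "S", 10, "A", 1, "A")
andrew = Case("M", 7, "C", "D", "1", "S", 10, "A", 1, "A")
print(raw_similarity(abby, andrew, relationship_points=100))  # 655
```

                Run on the two Borden codes with 100 relationship points, the sketch gives 655.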

                OK, we add up all of the points for similarities and connections that are awarded in Steps 1 through 12, and we put those aside for a moment.   Then we’ll combine the "time and place" information into one "percentage" score, and we’ll multiply the total from Steps One through Twelve by the percentage that represents the connection in time and place, OK?

                It is apparent that a separation in place is never absolute, since people can and do routinely travel across the United States and around the world.    Ted Bundy committed murders in Washington State and in Florida.   Serial murderers are often highly mobile, and a separation in space is never a persuasive argument that two murders are not linked. 

                What we are looking for, though, is not evidence that murders are NOT linked, but a way to measure the connections between them.    If one murder is committed in Washington State and one in Florida, that is not a connection between them; they may still be connected, but this is not a connection.   

                On the other hand, time can be an absolute barrier to a connection between two murders.  The person who murdered Dr. George Parkman in 1849 absolutely cannot have murdered Michelle O’Keefe in 2000, and cannot reasonably be suspected of any murder after about 1890.   People have committed related murders more than 20 years apart, but not very often.   In general, if you’re going to link crimes that are 20 years apart, you’ll have to find some other way to make that connection, other than through this process. 

                Here’s my suggestion.  Figure how many miles there are between any two murders (which can very easily be done with the Maidenhead Grid, which is constructed for that purpose), and subtract 5 (since five miles are not any barrier to two murders being linked.)   Take the square root of this number; if two murders are separated by 105 miles, that makes 10.0, and if they are separated by 1,000 miles, that makes 31.54.   Subtract this number from 200, and divide the result by 200.   This we will call "FIGURE A".
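                One way the mileage between two six-character grid codes might be figured is sketched below.   The article does not say how the distance is measured, so this is an assumption: it converts each code to the latitude and longitude at the center of its grid square and takes the straight-line (great-circle) distance, which will generally come out shorter than road mileage.   The function names are invented for the illustration, and 3,958.8 miles is used as the mean radius of the earth.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # mean radius, an approximation


def maidenhead_to_latlon(locator):
    """Return (lat, lon) in degrees for the center of a 6-character square."""
    loc = locator.strip()
    if len(loc) != 6:
        raise ValueError("expected a 6-character locator such as 'DM13hx'")
    field_lon = ord(loc[0].upper()) - ord("A")   # 20-degree fields
    field_lat = ord(loc[1].upper()) - ord("A")   # 10-degree fields
    square_lon = int(loc[2])                     # 2-degree squares
    square_lat = int(loc[3])                     # 1-degree squares
    sub_lon = ord(loc[4].lower()) - ord("a")     # 5-minute subsquares
    sub_lat = ord(loc[5].lower()) - ord("a")     # 2.5-minute subsquares
    lon = -180 + field_lon * 20 + square_lon * 2 + (sub_lon + 0.5) * (5 / 60)
    lat = -90 + field_lat * 10 + square_lat + (sub_lat + 0.5) * (2.5 / 60)
    return lat, lon


def great_circle_miles(loc_a, loc_b):
    """Haversine distance in miles between two grid squares."""
    lat1, lon1 = (math.radians(v) for v in maidenhead_to_latlon(loc_a))
    lat2, lon2 = (math.radians(v) for v in maidenhead_to_latlon(loc_b))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(h))


# Grid codes for Bates (DM13hx) and Moxley (FN31ea) from the list of codes
print(round(great_circle_miles("DM13hx", "FN31ea")))
```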

                Second, figure the number of DAYS that separate two crimes, and subtract 2 (since a murderous rage or a murderous panic can very easily last for two days.)    Don’t worry about leap years; it’s not a real difference.   Take the square root of that number.  If two crimes are separated in time by one month (30 days) this will yield 5.29, and if they are separated by one year, this will yield 19.05.

                Subtract this number from 150, and divide by 150.    This we will call "FIGURE B".

                Now multiply Figure A by Figure B.    For illustration: the distance between Riverside Community College, where Cheri Jo Bates was murdered, and Greenwich, Connecticut, where Martha Moxley was murdered, is 2,794 miles.   Subtract 5 (2789), take the square root (52.8), subtract from 200 (147.2), divide by 200, you’ve got .736.    That’s figure A.

                Martha Moxley was killed exactly nine years to the day after Cheri Jo Bates; that’s 3,285 days (ignoring leap years).   Subtract 2 (3283), take the square root (57.3), subtract that from 150 (92.7), divide by 150, you’ve got .618.  That’s figure B.

                Multiply Figure A times Figure B, .736 times .618; we’ve got .455.  The time-and-space connection score between these two events is 45.5%.  If the time-and-space connection score is less than zero, it’s zero.
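                The arithmetic for Figure A, Figure B and their product can be sketched as follows.   The function names are invented for the illustration, and two details are assumptions: a separation smaller than the 5-mile or 2-day allowance is treated as zero, and each figure is floored at zero before multiplying, so that a crime too far away in time can never produce a positive score.

```python
import math


def days_apart(year_a, julian_a, year_b, julian_b):
    """Days between two year/Julian-date pairs, ignoring leap years."""
    return abs((year_b - year_a) * 365 + (julian_b - julian_a))


def figure(separation, allowance, ceiling):
    """(ceiling - sqrt(separation - allowance)) / ceiling, floored at zero."""
    reduced = max(separation - allowance, 0)
    return max((ceiling - math.sqrt(reduced)) / ceiling, 0.0)


def time_space_factor(miles, days):
    figure_a = figure(miles, 5, 200)   # place
    figure_b = figure(days, 2, 150)    # time
    return figure_a * figure_b


def linked_score(raw_points, miles, days):
    """Total from Steps One through Twelve, scaled by time and place."""
    return raw_points * time_space_factor(miles, days)


# Bates (1966, day 303) and Moxley (1975, day 303), 2,794 miles apart
days = days_apart(1966, 303, 1975, 303)
print(days)                                      # 3285
print(round(time_space_factor(2794, days), 3))   # 0.455
```

                Two murders at the same place on the same day, like the Bordens, get a factor of 1.0, so their total is left unchanged.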

 

The Comparison Results

                This data suggests that the two murders on this list which are most likely to be connected are the murders of Andrew and Abby Borden in Fall River in 1892.

                Of course we knew this anyway.  This is what you do when you’re developing a new test:  you put in a few obvious questions just to make sure that your system is getting the obvious stuff right.

                With a list of 12 murders there are 66 possible murder-to-murder connections within the list.   The system scores the comparison of the Andrew and Abby Borden murders at 655.0, the highest of any of the 66 comparisons.   Here are the actual scores, with the scoresheet:
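                The count of 66 is just the number of ways to pick two cases out of twelve, and listing and ranking the pairs is easy to automate.   A small sketch, using only the four scores quoted in this article (the other 62 are on the scoresheet and are not reproduced here); the variable names are invented for the illustration:

```python
from itertools import combinations

victims = [
    "Cheri Jo Bates", "Abby Borden", "Andrew Borden", "Robbin Brandley",
    "John Davies", "Robin Gilbert", "Louis Masgay", "Martha Moxley",
    "Michelle O'Keefe", "George Parkman", "Gregg Smart", "Paul Stine",
]

# Every unordered pair of cases: 12 * 11 / 2 = 66
pairs = list(combinations(victims, 2))
print(len(pairs))  # 66

# Only the scores quoted in the text of the article
reported = {
    ("Abby Borden", "Andrew Borden"): 655.0,
    ("Robin Gilbert", "Martha Moxley"): 540.0,
    ("Cheri Jo Bates", "Paul Stine"): 185.0,
    ("Robbin Brandley", "Michelle O'Keefe"): 151.0,
}

# Highest score first
ranked = sorted(reported.items(), key=lambda item: item[1], reverse=True)
for (first, second), score in ranked:
    print(f"{score:6.1f}  {first} / {second}")
```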

 

CrimeLink_2

 

                The conclusion of this study...which obviously is something less than a tentative conclusion, since I’m really only trying to introduce the concept here; I’m not trying to reach any conclusions...but the conclusion would be that:

1)  The murders of Andrew and Abby Borden were obviously connected,

2)  The murders of Robin Gilbert and Martha Moxley stand out as dramatically more closely related than any other murders on this list, aside from the Bordens, and

3)  There is no reason, based on this analysis, to think that any other murders here were connected. 

                Obviously I could be just confirming my own prejudices here; this is something that would have to be worked out further down the road.

                There were a couple of other things that were going on in the background of this process, and let me draw those now into the light.  Robert Graysmith has written two or three books about the Zodiac murders; he is portrayed in the movie Zodiac by Jake Gyllenhaal (an excellent portrayal, by the way; my take on Graysmith, having read all of his books, is exactly as Gyllenhaal portrays him: an earnest, likeable person who is in way over his head taking on the Zodiac.)    Anyway, Graysmith believed, and has convinced much of the public, that the murder of Cheri Jo Bates was committed by the Zodiac, thus that it is linked to the murder of Paul Stine, which was committed by the Zodiac.

                My view is that there is no reason whatsoever to link these murders, and that the Zodiac very clearly did not murder Cheri Jo Bates, although he did claim credit for the crime after the option of doing so was suggested to him.   The murderer of Bates and the murderer of Stine both wrote letters to the police and to the media after the fact, which is relatively uncommon behavior for a murderer.   I gave the "match" between the two cases 85 points for that similarity, which is almost the same number of points I would have given had Bates and Stine been married, or had they been brother and sister.

                But even with those 85 points, the cases don’t score as notably similar; the similarity scores at 185, which is 9th on the list.   There just is very, very little reason to think that these two cases are connected.  That was what I thought going into this study, and that is what the analysis shows; there’s really not very much in common between these two.

                Another "linkage" that I was trying to work out was the connection between the murders of Robbin Brandley and Michelle O’Keefe.   Brandley and O’Keefe were both college girls, somewhat similar girls, actually: clean, church-going girls from strong families who were career-focused and did not have boyfriends although they were very attractive young women.  It strikes one that they could have been friends.   They were both in the Los Angeles area, each returning from a musical event early in the night (9:00, 9:30), and both were murdered in poorly-lit parking lots, Brandley in 1986 and O’Keefe in 2000.   

                The cases sound almost identical, but we know for certain that they are not connected, or at least there is a strong presumption that they are not connected, because Andrew Urdiales confessed to the 1986 murder of Brandley and was in jail in 2000.  It is not likely that Urdiales is lying.   I put those two cases in the package, then, to test whether the system would score them as highly similar.    If the cases of Brandley and O’Keefe had scored as highly similar, like those of Moxley and Gilbert, then that would demonstrate that such a score could easily be a coincidence.

                I will tell you sincerely, and you can take it for what it is worth, that I did believe that the deaths of Brandley and O’Keefe would score as highly similar, like the deaths of Moxley and Gilbert.   The system, from my standpoint, hit a home run on this one.   Quite to my surprise, the system distinguished clearly and easily between Brandley and O’Keefe’s murders, despite the apparent similarities between them.   The similarity of the two cases scored at only 151, 17th on the list of 66 comparisons—whereas the similarity between the murders of Moxley and Gilbert scored at 540, second on the list—and actually a fairly close second—to the murders of Abby and Andrew Borden.  

                OK; I think we’re done here.   It’s a big project, and this would just be the first step in a long road.    Thanks for reading. 

 

Bill James

April 25, 2012

 
 

COMMENTS (8 Comments, most recent shown first)

FabulousFriar
1) The range of similarity scores for crimes which could not possibly be related (0% by time and space) is 105 to 380. That of cases with <50% chance by time and space is 75 to 455, and that of cases at >50% is 50 to 625. If enough crimes were coded one could probably draw reasonable Bell curves of probability for any particular time/space segment.

2) If you code all the known victims of a serial killer, what would be the range of scores? That shouldn’t be very hard to do, given the number of “true crime” books that have been published on the subject.

3) In most modern violent crimes there is DNA evidence to be considered. If it is (arguably?) true that such evidence trumps all ‘circumstantial’ evidence, then what weight in these probability calculations should be given to it?

10:59 AM May 6th
 
jemanji
This type of scoring system, also used in baseball Similarity Scores, allows us to capture issues which can't be captured by the tools that pro mathematicians are used to using.

Further, it allows the 'triangulation' of these issues, gaining more and more accuracy as time goes along. We used something similar in order to crack eBay search words... do the words "Brown" and "Silk" in the title cause better search returns than if you left the words out?

This sort of "fuzzy logic" is the only tool I know of for problems like this, and yet the odd thing is... each time I've talked with statistics pro's about it, they hate it. Don't know whether Mr. James has met with similar opposition.

The Similarity Score paradigm is a tool that attacks problems that other paradigms can't. I wonder whether its use will proliferate.
4:12 PM May 5th
 
CharlesSaeger
#4: I'm of the opinion that Borden herself is the murderer, which isn't your opinion, so I'd call it unlikely.
9:47 PM May 3rd
 
enamee
You know, it wouldn't be too hard to code all the murders in the crime book... That'd be a good starting point for the project. What would be even better is to convince some criminology grad student to do his/her thesis on this.
4:47 PM May 3rd
 
hotstatrat
It is good to see Bill's abilities applied to something with more tangible practicality than baseball or even the history of popular crimes: solving crimes.
9:18 AM May 3rd
 
bjames
1) It is obvious that Bates is NOT a Zodiac victim; thus, an evaluation premised on the proposition that she MIGHT be would be a waste of time.

2) While your ideas for ways to expand the codes are quite good ones--number of obviously related victims, time spent at the crime site--I'd be reluctant to expand the codes. It would make the coding process more laborious, and I doubt that more elements would actually help us to identify and distinguish related/unrelated murders.

3) One could add "found with other victims" to the list of "signature elements".

4) Do you think there is any possibility the Bordens are related to the Villisca/Paola murders 20 years later?
4:50 PM May 2nd
 
CharlesSaeger
Ooh! A new category: number of other victims. Not only are the Bordens linked to each other, but also to other couples killed together.
4:31 PM May 2nd
 
CharlesSaeger
One Bates oddity is that the killer spent a long time with his victim, which the Zodiac, other than the Lake Berryessa attack, did not do; he was only with Stine a few minutes. It's often hard to see right away, but would this be a similar criterion?
7:51 AM May 2nd
 
 