Soldier Name Stats
Name Sets
There are six sets of X-COM soldier names, each composed of 20 first names and 20 last names. Five of the 20 first names in each set are female (as determined from SOLDIER.DAT byte 67) and are marked with an asterisk:
American Set
Austin Bradley
* Barbara Bryant
Calvin Carr
Carl Crossett
* Catherine Dodge
Clarence Gallagher
Donald Homburger
Dwight Horton
Ed Hudson
* Evelyn Johnson
Kevin Kemp
Lester King
Mark McNeil
Oscar Miller
* Patricia Mitchell
Samuel Nash
* Sigourney Stephens
Spencer Stoddard
Tom Thompson
Virgil Webb
British Set
Adam Bailey
Alan Blake
* Andrea Davies
Arthur Day
Brett Evans
Damien Hill
David Jones
Frank Jonlan
* Helen Martin
James Parker
* Jane Pearce
John Reynolds
* Maria Robinson
Michael Sharpe
Neil Smith
Patrick Stewart
Paul Taylor
Robert Watson
* Sarah White
Scott Wright
French Set
Armand Bouissou
Bernard Bouton
Claude Buchard
* Danielle Coicaud
Emile Collignon
Gaston Cuvelier
Gerard Dagallier
Henri Dreyfus
* Jacqueline Dujardin
Jacques Gaudin
Jean Gautier
Leon Gressier
Louis Guerin
Marc Laroyenne
Marcel Lecointe
* Marielle Lefevre
* Micheline Luget
Pierre Marcelle
Rene Pecheux
* Sylvie Revenu
German Set
* Christel Berger
Dieter Brehme
Franz Esser
Gerhard Faerber
* Gudrun Geisler
Gunter Gunkel
Hans Hafner
* Helga Heinsch
Jurgen Keller
* Karin Krause
Klaus Mederow
Manfred Meyer
Matthias Richter
Otto Schultz
Rudi Seidler
Siegfried Steinbach
Stefan Ulbricht
* Uta Unger
Werner Vogel
Wolfgang Zander
Japanese Set
Akinori Akira
Isao Fujimoto
Jungo Ishii
Kenji Iwahara
* Mariko Iwasaki
Masaharu Kojima
Masanori Koyama
* Michiko Matsumara
Naohiro Morita
* Sata Noguchi
Shigeo Okabe
Shigeru Okamoto
Shuji Sato
* Sumie Shimaoka
Tatsuo Shoji
Toshio Tanida
Yasuaki Tanikawa
Yataka Yamanaka
* Yoko Yamashita
Yuzo Yamazaki
Russian Set
Anatoly Andianov
Andrei Belov
* Astra Chukarin
Boris Gorokhova
Dmitriy Kolotov
* Galina Korkia
Gennadi Likhachev
Grigoriy Maleev
Igor Mikhailov
Ivan Petrov
Leonid Ragulin
* Lyudmila Romanov
Mikhail Samusenko
Nikolai Scharov
* Olga Shadrin
Sergei Shalimov
* Tatyana Torban
Victor Voronin
Vladimir Yakubik
Yuri Zhdanovich
Each set above lists its 20 first names and 20 last names, both sorted alphabetically. A first name sitting next to a last name does not mean the two are associated; the two columns are only there to conserve space. Any first name within a given set is liable to be combined with any last name in that set.
Test Set
A sample of 20 batches of 100 recruits (N = 2,000 in total) was generated. Not all 2,400 possible first and last name combinations (6 sets × 20 first names × 20 last names) appeared, of course, but first and last names always came from the same set. Thus you may see an Adam Bailey, but will never see an Adam Bradley.
510 of 2,000 soldiers were female (25.50%), almost exactly the expected 500 (25%).
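As a rough check on "almost exactly", the count can be compared with a binomial model. This is only a sketch: it assumes every recruit independently has a 5-in-20 chance of getting a female first name, which the source does not confirm.

```python
import math

n = 2000          # recruits in the sample
observed = 510    # female recruits observed
p = 5 / 20        # assumption: each first name in a set is equally likely

expected = n * p                          # 500
std_dev = math.sqrt(n * p * (1 - p))      # spread expected from chance alone
z = (observed - expected) / std_dev       # distance from expectation in std devs

print(f"expected {expected:.0f}, std dev {std_dev:.1f}, z = {z:.2f}")
```

Under that assumption the observed count sits about half a standard deviation above the expected 500, well within ordinary random variation.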
No duplicate names were observed within a given batch of 100, but numerous duplicates were observed across batches. There were 969 unique names among the 2,000, with the most-duplicated name appearing 8 times. X-COM probably uses a simple method for avoiding duplicates within a batch, such as picking a random starting point in the name table (based on how many soldiers you've just recruited) and then walking through the table instead of repeatedly sampling it at random. The table below shows how many names appeared once, twice, and so on:
Times seen   Names   Soldiers
         1     496        496
         2     181        362
         3     131        393
         4      93        372
         5      40        200
         6      20        120
         7       7         49
         8       1          8
             -----     ------
               969       2000
Thus, 1,431 of the possible 2,400 name combinations (2400-969) did not appear.
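For comparison, here is a small sketch of how many distinct names one would expect if all 2,400 combinations were equally likely and every recruit were an independent draw. Both conditions are assumptions, not known facts about the game. The sketch also ignores duplicate avoidance within a batch, which would only push the expected count higher.

```python
TOTAL_NAMES = 6 * 20 * 20   # 2,400 possible combinations
DRAWS = 2000                # recruits in the sample


def expected_unique(total: int, draws: int) -> float:
    """Expected number of distinct names after `draws` uniform, independent picks."""
    p_never_picked = (1 - 1 / total) ** draws   # chance one given name is never drawn
    return total * (1 - p_never_picked)


exp_unique = expected_unique(TOTAL_NAMES, DRAWS)
print(f"expected unique names: {exp_unique:.0f} (observed: 969)")
```

Under those assumptions the expected figure comes out at roughly 1,357 distinct names, noticeably more than the 969 observed. That hints that names are not drawn uniformly, though it does not say where the unevenness comes from.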
Frequency by nationality for the 2,000:
Nationality Frequency
B 359
A 316
F 335
G 365
J 284
R 341
It is not known why many combinations never showed up while others showed up multiple times, nor why there were, for example, 284 Japanese and 365 Germans when the expected value is about 333 (2000/6) for each set. Perhaps these results are due to random chance, or perhaps the name sampler has some bias that makes certain combinations or nationalities more likely than others. Or maybe my 20 batches were simply not a big enough sample, particularly if the name selector does something odd when trying to avoid duplicates. For the complete dataset (including counts), see Media:X-COM Soldier Names.xls. If anyone knows how to do statistical testing for possible biases, feel free. A much larger sample (10,000 recruits?) would probably give a clearer picture... but it would require 100 recruit batches, bleh. -MTR
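One standard way to test the nationality counts is a chi-square goodness-of-fit test against equal expected frequencies. The sketch below uses only the counts from the table above. It assumes the 2,000 recruits can be treated as independent draws, which is not strictly true given the duplicate avoidance, so the result is a rough guide rather than a verdict. The critical values are the usual tabulated ones for 5 degrees of freedom.

```python
# Observed recruits per name set, taken from the frequency table above.
observed = {
    "American": 316, "British": 359, "French": 335,
    "German": 365, "Japanese": 284, "Russian": 341,
}

total = sum(observed.values())       # 2000
expected = total / len(observed)     # about 333.3 per set if all are equally likely

# Pearson chi-square statistic: sum of (observed - expected)^2 / expected
chi2 = sum((count - expected) ** 2 / expected for count in observed.values())

# Tabulated critical values for 5 degrees of freedom (6 categories - 1)
CRIT_05 = 11.070   # 5% significance level
CRIT_01 = 15.086   # 1% significance level

print(f"chi-square = {chi2:.2f}")
print("exceeds 5% critical value:", chi2 > CRIT_05)
print("exceeds 1% critical value:", chi2 > CRIT_01)
```

The statistic comes out at about 13.4, which is above the 5% critical value but below the 1% one. That is suggestive of uneven nationality frequencies without being conclusive, so a larger sample would still help.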
Duplicates
While playing a fairly average game (fewer than 100 soldiers generated so far), I manually put stat strings on my soldiers' names, and I currently have two Yoko Fujimotos with different stat strings. Duplicate names aren't seen within a single game when testing names, yet a second Yoko Fujimoto was generated after I renamed my original to Yoko Fujimoto-xs. So I'm going to go ahead and say "The game probably just avoids duplicates by comparing the name it generates to each existing soldier's name". I predict that if you hire a female Russian soldier and change her name to "Austin Bradley", you will never see another soldier generated with the name "Austin Bradley", but you might see a new soldier with her original name. This would be very tedious to test. --Sowelu 14:15, 16 September 2008 (PDT)
- Check no further. I had a look at the code, and it does indeed check the new name against every entry in SOLDIER.DAT and regenerates it in case of a collision (up to ten times; after that it gives up and uses the duplicate). There is no check that an entry is valid, so the names of dead soldiers should also be avoided as long as their entries have not been overwritten. Seb76 15:22, 16 September 2008 (PDT)
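The behaviour described in the reply above can be sketched roughly as follows. This is an illustration, not the game's actual code: `generate_name`, `name_sets` and `existing_names` are hypothetical names, "up to ten times" is read here as ten attempts in total, and picking a set first and then a first and last name from it is an assumption about how the raw name is rolled.

```python
import random

MAX_ATTEMPTS = 10  # assumption: "up to ten times" means ten rolls in total


def generate_name(name_sets, existing_names, rng=random):
    """Roll a name, re-rolling on collision; keep the last roll if every attempt collides.

    name_sets      -- list of (first_names, last_names) pairs, one per nationality
    existing_names -- names found in every SOLDIER.DAT record, including records of
                      dead soldiers that have not been overwritten
    """
    name = ""
    for _ in range(MAX_ATTEMPTS):
        first_names, last_names = rng.choice(name_sets)   # pick a nationality
        name = f"{rng.choice(first_names)} {rng.choice(last_names)}"
        if name not in existing_names:
            break                                         # no collision, accept it
    return name                                           # may be a duplicate after giving up


# Tiny made-up table with a single possible name, so a collision is unavoidable.
tiny_sets = [(["Yoko"], ["Fujimoto"])]
print(generate_name(tiny_sets, {"Yoko Fujimoto"}))     # gives up, returns the duplicate
print(generate_name(tiny_sets, {"Yoko Fujimoto-xs"}))  # renamed original no longer blocks it
```

The second call mirrors the Yoko Fujimoto-xs observation: once the original soldier is renamed, her old name is no longer in the list being checked, so it can be generated again.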