
Soldier Name Stats: Difference between revisions

From UFOpaedia
MikeTheRed (talk | contribs)
mNo edit summary
 
(8 intermediate revisions by 5 users not shown)
Line 3: Line 3:
There are six sets of X-COM soldier names, each composed of 20 first names and 20 last names. 5 of the 20 first names in each set are female (based on [[SOLDIER.DAT]] byte 67), denoted by an asterisk:


      '''''<u>British Set One</u>
  '''''<u>American Set</u>              <u>British Set</u>               <u>French Set</u>'''''
       Adam         Bailey
  Austin    Bradley        Adam       Bailey         Armand    Bouissou
       Alan        Blake
* Barbara    Bryant        Alan       Blake         Bernard    Bouton
     * Andrea       Davies
  Calvin     Carr        * Andrea     Davies         Claude    Buchard
       Arthur       Day
  Carl      Crossett       Arthur     Day         * Danielle  Coicaud
      Brett       Evans
* Catherine  Dodge          Brett     Evans         Emile      Collignon
      Damien       Hill
  Clarence  Gallagher      Damien     Hill           Gaston    Cuvelier
      David       Jones
  Donald    Homburger      David     Jones         Gerard    Dagallier
      Frank       Jonlan
  Dwight    Horton        Frank     Jonlan         Henri      Dreyfus
    * Helen       Martin
  Ed        Hudson      * Helen     Martin       * Jacqueline Dujardin
      James       Parker
* Evelyn    Johnson        James     Parker         Jacques    Gaudin
    * Jane        Pearce
  Kevin      Kemp        * Jane       Pearce         Jean      Gautier
       John        Reynolds
  Lester    King          John       Reynolds       Leon      Gressier
    * Maria       Robinson
  Mark      McNeil      * Maria     Robinson       Louis      Guerin
      Michael     Sharpe
  Oscar      Miller        Michael   Sharpe         Marc      Laroyenne
       Neil         Smith
* Patricia  Mitchell       Neil       Smith         Marcel    Lecointe
      Patrick      Stewart
  Samuel    Nash          Patrick   Stewart     * Marielle  Lefevre
       Paul         Taylor
* Sigourney  Stephens       Paul       Taylor       * Micheline  Luget
       Robert       Watson
  Spencer    Stoddard       Robert     Watson         Pierre    Marcelle
     * Sarah       White
  Tom        Thompson     * Sarah     White         Rene      Pecheux
      Scott       Wright
  Virgil    Webb          Scott     Wright       * Sylvie    Revenu
   
   
      '''''<u>British Set Two</u>
  '''''<u>German Set</u>               <u>Japanese Set</u>             <u>Russian Set</u>'''''
      Austin      Bradley
* Christel   Berger         Akinori    Akira          Anatoly    Andianov
    * Barbara      Bryant
  Dieter    Brehme        Isao       Fujimoto       Andrei    Belov
      Calvin      Carr
  Franz     Esser          Jungo      Ishii       * Astra      Chukarin
      Carl        Crossett
  Gerhard   Faerber        Kenji     Iwahara        Boris      Gorokhova
    * Catherine    Dodge
* Gudrun     Geisler     * Mariko    Iwasaki        Dmitriy    Kolotov
      Clarence    Gallagher
  Gunter     Gunkel        Masaharu  Kojima       * Galina     Korkia
      Donald      Homburger
  Hans       Hafner         Masanori  Koyama         Gennadi   Likhachev
      Dwight      Horton
* Helga      Heinsch     * Michiko   Matsumara      Grigoriy  Maleev
      Ed          Hudson
  Jurgen     Keller         Naohiro    Morita         Igor       Mikhailov
    * Evelyn      Johnson
  * Karin     Krause       * Sata       Noguchi       Ivan       Petrov
      Kevin        Kemp
  Klaus     Mederow        Shigeo     Okabe          Leonid     Ragulin
      Lester      King
  Manfred    Meyer          Shigeru    Okamoto      * Lyudmila   Romanov
      Mark        McNeil
  Matthias  Richter        Shuji      Sato          Mikhail   Samusenko
      Oscar        Miller
  Otto      Schultz      * Sumie      Shimaoka       Nikolai   Scharov
    * Patricia    Mitchell
  Rudi      Seidler        Tatsuo     Shoji        * Olga       Shadrin
      Samuel      Nash
  Siegfried  Steinbach      Toshio    Tanida        Sergei     Shalimov
    * Sigourney    Stephens
  Stefan    Ulbricht      Yasuaki    Tanikawa     * Tatyana   Torban
      Spencer      Stoddard
* Uta        Unger          Yataka    Yamanaka       Victor     Voronin
      Tom          Thompson
  Werner    Vogel        * Yoko       Yamashita      Vladimir   Yakubik
      Virgil      Webb
  Wolfgang  Zander        Yuzo      Yamazaki       Yuri       Zhdanovich
      '''''<u>French Set</u>
      Armand      Bouissou
      Bernard      Bouton
      Claude      Buchard
    * Danielle    Coicaud
      Emile        Collignon
      Gaston      Cuvelier
      Gerard      Dagallier
      Henri        Dreyfus
    * Jacqueline  Dujardin
      Jacques      Gaudin
      Jean        Gautier
      Leon        Gressier
      Louis        Guerin
      Marc        Laroyenne
      Marcel      Lecointe
    * Marielle    Lefevre
    * Micheline    Luget
      Pierre      Marcelle
      Rene        Pecheux
    * Sylvie      Revenu
      '''''<u>German Set</u>
    * Christel     Berger
       Dieter       Brehme
      Franz        Esser
      Gerhard      Faerber
    * Gudrun       Geisler
      Gunter       Gunkel
      Hans         Hafner
    * Helga        Heinsch
       Jurgen      Keller
    * Karin        Krause
      Klaus        Mederow
      Manfred      Meyer
      Matthias     Richter
       Otto         Schultz
      Rudi         Seidler
      Siegfried   Steinbach
      Stefan      Ulbricht
    * Uta          Unger
      Werner      Vogel
      Wolfgang    Zander
      '''''<u>Japanese Set</u>
      Akinori     Akira
      Isao        Fujimoto
      Jungo        Ishii
      Kenji        Iwahara
    * Mariko      Iwasaki
      Masaharu    Kojima
      Masanori    Koyama
    * Michiko     Matsumara
      Naohiro     Morita
     * Sata         Noguchi
      Shigeo      Okabe
      Shigeru      Okamoto
      Shuji        Sato
    * Sumie        Shimaoka
      Tatsuo      Shoji
      Toshio      Tanida
      Yasuaki      Tanikawa
      Yataka      Yamanaka
    * Yoko         Yamashita
       Yuzo        Yamazaki
   
      '''''<u>Russian Set</u>
      Anatoly     Andianov
      Andrei       Belov
    * Astra        Chukarin
       Boris       Gorokhova
       Dmitriy      Kolotov
    * Galina      Korkia
      Gennadi     Likhachev
      Grigoriy     Maleev
      Igor        Mikhailov
      Ivan        Petrov
      Leonid       Ragulin
    * Lyudmila     Romanov
      Mikhail     Samusenko
       Nikolai     Scharov
     * Olga         Shadrin
      Sergei       Shalimov
     * Tatyana     Torban
       Victor       Voronin
       Vladimir     Yakubik
       Yuri         Zhdanovich


Columns show first and last names for each set of 20. There is no association per se between a particular first name being next to a last name (above) - I'm simply presenting each set sorted alphabetically, and used two columns to conserve space. Any first name within a given set is liable to be combined with any last name in that set.
Columns show first and last names for each set of 20. There is no association between a particular first name and last name (above) - first names are randomly combined with last names from the same nationality. So you may see an Austin Bradley, but you will never see an Austin Bailey. There are 2,400 possible unique names (20x20x6), a fourth of which are female.


== Test Set ==
When generating a new soldier's name, the game code checks for name duplication ten times against existing soldier names (including any deceased soldiers still in [[SOLDIER.DAT]]). This makes duplicate names vanishingly rare. If you have the [[Hiring/firing|maximum of 250 soldiers]], 250/2400 gives a 10.42% chance of a collision on each try, but raised to the 10th power this becomes 1.5x10<sup>-10</sup> (i.e., about 1 in 7 billion). If, like most folks, you have fewer than 250 soldiers, duplicates will be even rarer. With the 8 starting soldiers, it happens only about once in 6x10<sup>24</sup> tries. So duplicates are practically impossible. But if you change a soldier's name, even just to add an asterisk, it will no longer match game-generated ones, and the original name might appear again.


20 batches of 100 recruits (total N=2,000) were used as a sample. Not all possible 2,400 first and last name combinations appeared, of course, but first and last names were always associated as shown above. Thus you may see an Adam Bailey, but will never see an Adam Bradley.
510 of 2,000 soldiers were female (25.50%), almost exactly the expected 500 (25%).
No duplicate names were observed within a given batch of 100, but numerous duplicates were observed across batches. There were 969 unique names in the 2,000, with the most-duplicated name appearing 8 times. X-COM probably uses a simple method for avoiding duplicates within a batch, such as using a random pointer into the name table (based on how many soldiers you've just recruited) and then walking through the name table (instead of repeatedly randomly sampling it). In any event, regardless of how they did it, there were no duplicates within a batch of recruits, but there were duplicates across batches.
  <u>Freq</u>  <u>Count</u>   <u>Sum</u>
    1     496   496
    2     181   362
    3     131   393
    4      93   372
    5      40   200
    6      20   120
    7       7    49
    8       1     8
        -----  ----
          969  2000
Thus, 1,431 of the possible 2,400 name combinations (2400-969) did not appear.
Frequency by nationality for the 2,000:
<u>Nationality</u>  <u>Frequency</u>
      B1        359
      B2        316
      F         335
      G         365
      J         284
      R         341
It is not known why many combinations didn't show up, while others showed up multiple times. Also e.g. why there were 284 Japanese and 365 Germans, when the expected value is 333 (2000/6) for each set. Perhaps these results are due to random chance, or perhaps the name sampler has some sort of bias that makes certain combinations or nationalities more likely than others. Or maybe my 20 batches were simply not a big enough sample, particularly if the name selector does something odd when trying to avoid duplicates. For the complete dataset (including counts), see [[Media:X-COM Soldier Names.xls]]. If anyone knows how to do statistical testing for possible biases, feel free. Probably a much larger sample (10,000 recruits?) will give a clearer picture... but it would require 100 recruit batches, bleh. ''-[[User:MikeTheRed|MTR]]
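
One way to check the nationality counts for bias is a chi-square goodness-of-fit test. The following is only a rough sketch using the six frequencies listed above. It assumes every recruit is an independent draw with each set equally likely, which is not quite true if the game avoids duplicates within a batch, so treat the outcome as a hint rather than proof. The critical values in the comments are the standard chi-square table values for 5 degrees of freedom.

```python
# Observed counts per name set, copied from the nationality table above.
observed = {"B1": 359, "B2": 316, "F": 335, "G": 365, "J": 284, "R": 341}

total = sum(observed.values())       # 2,000 recruits in the sample
expected = total / len(observed)     # about 333.3 per set if all sets are equally likely

# Chi-square statistic: sum of (observed - expected)^2 / expected over the six sets.
chi_square = sum((n - expected) ** 2 / expected for n in observed.values())

# Standard table values for 5 degrees of freedom (six categories minus one).
CRITICAL_5PCT_DF5 = 11.07   # alpha = 0.05
CRITICAL_1PCT_DF5 = 15.09   # alpha = 0.01

print(f"chi-square = {chi_square:.3f}")
print("exceeds 5% critical value:", chi_square > CRITICAL_5PCT_DF5)
print("exceeds 1% critical value:", chi_square > CRITICAL_1PCT_DF5)
```

Under these assumptions the statistic falls between the 5% and 1% critical values, so the spread between sets is somewhat larger than pure chance would usually give, but not overwhelmingly so. A larger sample would still be needed to settle it.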


==See Also==
*[[Raw_recruit_statistical_likelihood|Recruit Statistics]]
*[[Soldiers (UFO Defense)|Soldiers]]
*[[Soldiers]]
*[[Hiring/firing]]
[[Category:Enemy Unknown/UFO Defense]]

Latest revision as of 20:52, 8 May 2013

Name Sets

There are six sets of X-COM soldier names, each composed of 20 first names and 20 last names. 5 of the 20 first names in each set are female (based on SOLDIER.DAT byte 67), denoted by an asterisk:

  American Set              British Set               French Set
  Austin     Bradley        Adam       Bailey         Armand     Bouissou
* Barbara    Bryant         Alan       Blake          Bernard    Bouton
  Calvin     Carr         * Andrea     Davies         Claude     Buchard
  Carl       Crossett       Arthur     Day          * Danielle   Coicaud
* Catherine  Dodge          Brett      Evans          Emile      Collignon
  Clarence   Gallagher      Damien     Hill           Gaston     Cuvelier
  Donald     Homburger      David      Jones          Gerard     Dagallier
  Dwight     Horton         Frank      Jonlan         Henri      Dreyfus
  Ed         Hudson       * Helen      Martin       * Jacqueline Dujardin
* Evelyn     Johnson        James      Parker         Jacques    Gaudin
  Kevin      Kemp         * Jane       Pearce         Jean       Gautier
  Lester     King           John       Reynolds       Leon       Gressier
  Mark       McNeil       * Maria      Robinson       Louis      Guerin
  Oscar      Miller         Michael    Sharpe         Marc       Laroyenne
* Patricia   Mitchell       Neil       Smith          Marcel     Lecointe
  Samuel     Nash           Patrick    Stewart      * Marielle   Lefevre
* Sigourney  Stephens       Paul       Taylor       * Micheline  Luget
  Spencer    Stoddard       Robert     Watson         Pierre     Marcelle
  Tom        Thompson     * Sarah      White          Rene       Pecheux
  Virgil     Webb           Scott      Wright       * Sylvie     Revenu

  German Set                Japanese Set              Russian Set
* Christel   Berger         Akinori    Akira          Anatoly    Andianov
  Dieter     Brehme         Isao       Fujimoto       Andrei     Belov
  Franz      Esser          Jungo      Ishii        * Astra      Chukarin
  Gerhard    Faerber        Kenji      Iwahara        Boris      Gorokhova
* Gudrun     Geisler      * Mariko     Iwasaki        Dmitriy    Kolotov
  Gunter     Gunkel         Masaharu   Kojima       * Galina     Korkia
  Hans       Hafner         Masanori   Koyama         Gennadi    Likhachev
* Helga      Heinsch      * Michiko    Matsumara      Grigoriy   Maleev
  Jurgen     Keller         Naohiro    Morita         Igor       Mikhailov
* Karin      Krause       * Sata       Noguchi        Ivan       Petrov
  Klaus      Mederow        Shigeo     Okabe          Leonid     Ragulin
  Manfred    Meyer          Shigeru    Okamoto      * Lyudmila   Romanov
  Matthias   Richter        Shuji      Sato           Mikhail    Samusenko
  Otto       Schultz      * Sumie      Shimaoka       Nikolai    Scharov
  Rudi       Seidler        Tatsuo     Shoji        * Olga       Shadrin
  Siegfried  Steinbach      Toshio     Tanida         Sergei     Shalimov
  Stefan     Ulbricht       Yasuaki    Tanikawa     * Tatyana    Torban
* Uta        Unger          Yataka     Yamanaka       Victor     Voronin
  Werner     Vogel        * Yoko       Yamashita      Vladimir   Yakubik
  Wolfgang   Zander         Yuzo       Yamazaki       Yuri       Zhdanovich

Columns show first and last names for each set of 20. There is no association between a particular first name and last name (above) - first names are randomly combined with last names from the same nationality. So you may see an Austin Bradley, but you will never see an Austin Bailey. There are 2,400 possible unique names (20x20x6), a fourth of which are female.
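
The pairing rule and the 2,400 figure can be illustrated with a short sketch. This is not the game's code: `NAME_SETS` holds only a few names from two of the six sets, `random_name` is a made-up helper, and uniform random selection is an assumption, since the page does not describe the game's actual random number routine.

```python
import random

# Abbreviated excerpts of two of the six name sets from the table above;
# the real sets hold 20 first names and 20 last names each.
NAME_SETS = {
    "American": (["Austin", "Barbara", "Calvin"], ["Bradley", "Bryant", "Carr"]),
    "British": (["Adam", "Alan", "Andrea"], ["Bailey", "Blake", "Davies"]),
}


def random_name(rng):
    """Pick a set, then a first and a last name from that same set.

    Uniform selection is an assumption; the game's actual RNG may differ.
    """
    set_name = rng.choice(sorted(NAME_SETS))
    firsts, lasts = NAME_SETS[set_name]
    return set_name, rng.choice(firsts), rng.choice(lasts)


# Counting argument from the text: 6 sets x 20 first names x 20 last names,
# with 5 female first names per set.
SETS, FIRSTS, LASTS, FEMALE_FIRSTS = 6, 20, 20, 5
total_names = SETS * FIRSTS * LASTS            # 2,400 possible names
female_names = SETS * FEMALE_FIRSTS * LASTS    # 600, a fourth of the total

print(random_name(random.Random()))
print(total_names, female_names)
```

Because the first and last name are always drawn from the same set, an Austin Bradley can come out of this sketch but an Austin Bailey cannot.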

When generating a new soldier's name, the game code checks for name duplication ten times against existing soldier names (including any deceased soldiers still in SOLDIER.DAT). This makes duplicate names vanishingly rare. If you have the maximum of 250 soldiers, 250/2400 gives a 10.42% chance of a collision on each try, but raised to the 10th power this becomes 1.5x10^-10 (i.e., about 1 in 7 billion). If, like most folks, you have fewer than 250 soldiers, duplicates will be even rarer. With the 8 starting soldiers, it happens only about once in 6x10^24 tries. So duplicates are practically impossible. But if you change a soldier's name, even just to add an asterisk, it will no longer match game-generated ones, and the original name might appear again.
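
The arithmetic above can be reproduced in a few lines. This sketch assumes that "checks ten times" means up to ten independent uniform draws from the 2,400 names, with a duplicate surviving only if every draw hits a name already in use, and that all existing soldiers have distinct game-generated names. The function name `duplicate_probability` is made up for illustration.

```python
def duplicate_probability(existing, total_names=2400, tries=10):
    """Chance that all `tries` random draws land on a name already in use.

    Assumes uniform, independent draws and that the `existing` soldiers
    all carry distinct game-generated names.
    """
    return (existing / total_names) ** tries


p_full = duplicate_probability(250)   # roster at the 250-soldier maximum
p_start = duplicate_probability(8)    # the 8 starting soldiers

print(f"250 soldiers: {p_full:.1e} (about 1 in {1 / p_full:.1e})")
print(f"  8 soldiers: {p_start:.1e} (about 1 in {1 / p_start:.1e})")
```

With fewer soldiers the per-try collision chance shrinks, and raising it to the 10th power makes the overall chance fall off very quickly.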


See Also