
Soldier Name Stats: Difference between revisions

From UFOpaedia
Sowelu (talk | contribs)
Hypothesis on duplicate name checking
There are six sets of X-COM soldier names, each composed of 20 first names and 20 last names. 5 of the 20 first names in each set are female (based on [[SOLDIER.DAT]] byte 67), denoted by an asterisk:


  '''''<u>American Set</u>             <u>British Set</u>              <u>French Set</u>'''''
  Austin     Bradley        Adam       Bailey         Armand     Bouissou
* Barbara    Bryant         Alan       Blake          Bernard    Bouton
  Calvin     Carr         * Andrea     Davies         Claude     Buchard
  Carl       Crossett       Arthur     Day          * Danielle   Coicaud
* Catherine  Dodge          Brett      Evans          Emile      Collignon
  Clarence   Gallagher      Damien     Hill           Gaston     Cuvelier
  Donald     Homburger      David      Jones          Gerard     Dagallier
  Dwight     Horton         Frank      Jonlan         Henri      Dreyfus
  Ed         Hudson       * Helen      Martin       * Jacqueline Dujardin
* Evelyn     Johnson        James      Parker         Jacques    Gaudin
  Kevin      Kemp         * Jane       Pearce         Jean       Gautier
  Lester     King           John       Reynolds       Leon       Gressier
  Mark       McNeil       * Maria      Robinson       Louis      Guerin
  Oscar      Miller         Michael    Sharpe         Marc       Laroyenne
* Patricia   Mitchell       Neil       Smith          Marcel     Lecointe
  Samuel     Nash           Patrick    Stewart      * Marielle   Lefevre
* Sigourney  Stephens       Paul       Taylor       * Micheline  Luget
  Spencer    Stoddard       Robert     Watson         Pierre     Marcelle
  Tom        Thompson     * Sarah      White          Rene       Pecheux
  Virgil     Webb           Scott      Wright       * Sylvie     Revenu

  '''''<u>German Set</u>               <u>Japanese Set</u>             <u>Russian Set</u>'''''
* Christel   Berger         Akinori    Akira          Anatoly    Andianov
  Dieter     Brehme         Isao       Fujimoto       Andrei     Belov
  Franz      Esser          Jungo      Ishii        * Astra      Chukarin
  Gerhard    Faerber        Kenji      Iwahara        Boris      Gorokhova
* Gudrun     Geisler      * Mariko     Iwasaki        Dmitriy    Kolotov
  Gunter     Gunkel         Masaharu   Kojima       * Galina     Korkia
  Hans       Hafner         Masanori   Koyama         Gennadi    Likhachev
* Helga      Heinsch      * Michiko    Matsumara      Grigoriy   Maleev
  Jurgen     Keller         Naohiro    Morita         Igor       Mikhailov
* Karin      Krause       * Sata       Noguchi        Ivan       Petrov
  Klaus      Mederow        Shigeo     Okabe          Leonid     Ragulin
  Manfred    Meyer          Shigeru    Okamoto      * Lyudmila   Romanov
  Matthias   Richter        Shuji      Sato           Mikhail    Samusenko
  Otto       Schultz      * Sumie      Shimaoka       Nikolai    Scharov
  Rudi       Seidler        Tatsuo     Shoji        * Olga       Shadrin
  Siegfried  Steinbach      Toshio     Tanida         Sergei     Shalimov
  Stefan     Ulbricht       Yasuaki    Tanikawa     * Tatyana    Torban
* Uta        Unger          Yataka     Yamanaka       Victor     Voronin
  Werner     Vogel        * Yoko       Yamashita      Vladimir   Yakubik
  Wolfgang   Zander         Yuzo       Yamazaki       Yuri       Zhdanovich


Columns show first and last names for each set of 20, sorted alphabetically and placed side by side to conserve space. There is no association between a particular first name and the last name next to it (above): first names are randomly combined with last names from the same nationality, and any first name within a given set is liable to be combined with any last name in that set. So you may see an Austin Bradley, but you will never see an Austin Bailey. There are 2,400 possible unique names (20x20x6), a fourth of which are female.


== Test Set ==
When generating a new soldier's name, the game code checks for duplication against existing soldier names (including any deceased soldiers still in [[SOLDIER.DAT]]), making up to ten tries. This makes duplicate names vanishingly rare. If you have the [[Hiring/firing|maximum of 250 soldiers]], each try has a 250/2400 = 10.42% chance of matching an existing name, but a duplicate only results if all ten tries match, and 0.1042 raised to the 10th power is 1.5x10<sup>-10</sup> (i.e., about 1 in 7 billion names). If, like most folks, you have fewer than 250 soldiers, duplicates will be even rarer. With the 8 starting soldiers, the odds are about 1 in 6x10<sup>24</sup>. So duplicates are practically impossible. But if you change a soldier's name, even just to add an asterisk, it will no longer match any game-generated name, and the original name might appear again.


20 batches of 100 recruits (total N=2,000) were used as a sample. Not all 2,400 possible first and last name combinations appeared, of course, but first and last names always came from the same set, as shown above. Thus you may see an Adam Bailey, but will never see an Adam Bradley.
510 of 2,000 soldiers were female (25.50%), almost exactly the expected 500 (25%).
No duplicate names were observed within a given batch of 100, but numerous duplicates were observed across batches. There were 969 unique names in the 2,000, with the most-duplicated name appearing 8 times. X-COM probably uses a simple method for avoiding duplicates within a batch, such as picking a random starting point in the name table (based on how many soldiers you've just recruited) and then walking through the table, instead of repeatedly sampling it at random. The tally below shows how many names (Count) appeared a given number of times (Freq):
  <u>Freq</u>  <u>Count</u>    <u>Sum</u>
     1    496    496
     2    181    362
     3    131    393
     4     93    372
     5     40    200
     6     20    120
     7      7     49
     8      1      8
          ---   ----
          969   2000
Thus, 1,431 of the possible 2,400 name combinations (2400-969) did not appear.
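As a rough point of comparison for the 969 unique names, here is a minimal sketch of how many distinct names one would expect if every recruit's name were drawn uniformly and independently from all 2,400 combinations. That uniform, independent model is an assumption made for illustration, not something established about the game's code, and `expected_unique` is a hypothetical helper name.

```python
POSSIBLE_NAMES = 20 * 20 * 6   # 2,400 combinations, from the name tables
SAMPLE_SIZE = 2000             # 20 batches of 100 recruits
OBSERVED_UNIQUE = 969          # unique names seen in the sample


def expected_unique(possible, draws):
    """Expected number of distinct names after `draws` uniform,
    independent draws from `possible` equally likely names."""
    # Each name is missed by a single draw with probability 1 - 1/possible.
    chance_never_drawn = (1 - 1 / possible) ** draws
    return possible * (1 - chance_never_drawn)


expected = expected_unique(POSSIBLE_NAMES, SAMPLE_SIZE)
print(f"expected unique names: {expected:.0f}")
print(f"observed unique names: {OBSERVED_UNIQUE}")
```

Under that assumption roughly 1,357 distinct names would be expected, well above the 969 observed. Avoiding duplicates within a batch would, if anything, raise the number of distinct names, so the gap is at least consistent with the suspicion that some combinations are favored.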
Frequency by nationality for the 2,000:
  <u>Nationality</u>  <u>Frequency</u>
       B            359
       A            316
       F            335
       G            365
       J            284
       R            341
It is not known why many combinations didn't show up while others showed up multiple times, or why there were, e.g., 284 Japanese and 365 Germans when the expected value is 333 (2000/6) for each set. Perhaps these results are due to random chance, or perhaps the name sampler has some sort of bias that makes certain combinations or nationalities more likely than others. Or maybe my 20 batches were simply not a big enough sample, particularly if the name selector does something odd when trying to avoid duplicates. For the complete dataset (including counts), see [[Media:X-COM Soldier Names.xls]]. If anyone knows how to do statistical testing for possible biases, feel free. A much larger sample (10,000 recruits?) would probably give a clearer picture, but it would require 100 recruit batches, bleh. ''-[[User:MikeTheRed|MTR]]
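One possible way to test the nationality counts for bias is a chi-square goodness-of-fit test against equal frequencies. The sketch below is only an illustration: it assumes each recruit's nationality is an independent draw (which the within-batch duplicate avoidance may violate), the function name is made up, and the 5% critical value for 5 degrees of freedom is taken from a standard chi-square table.

```python
# Observed counts per name set, copied from the nationality table above.
observed = {"B": 359, "A": 316, "F": 335, "G": 365, "J": 284, "R": 341}

# Standard chi-square critical value for 5 degrees of freedom at the 5% level.
CRITICAL_5_PERCENT_DF5 = 11.070


def chi_square_uniform(counts):
    """Chi-square statistic for counts against an equal-frequency expectation."""
    total = sum(counts)
    expected = total / len(counts)  # 2000 / 6 for this sample
    return sum((c - expected) ** 2 / expected for c in counts)


statistic = chi_square_uniform(list(observed.values()))
print(f"chi-square = {statistic:.2f} with {len(observed) - 1} degrees of freedom")
print("exceeds the 5% critical value:", statistic > CRITICAL_5_PERCENT_DF5)
```

For these counts the statistic comes out at about 13.4, which is above the 5% critical value of 11.07 but below the 1% value of 15.086. That is mild evidence that the sets are not equally likely, not proof, and a larger sample would still help.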
== Duplicates ==
While playing a fairly average game (less than 100 soldiers ever generated so far), I manually put stat strings on my soldiers' names.  I currently have two Yoko Fujimotos with various stat strings.  Considering that duplicate names aren't seen within a single game while testing names, but that I did see a second Yoko Fujimoto get generated after I changed the name of my original to Yoko Fujimoto-xs, I'm going to go ahead and say "The game probably just avoids duplicates by comparing the name it generates to each existing soldier's name".  I predict that if you hire a female Russian soldier, and change her name to "Austin Bradley", you will never see another soldier generated with the name "Austin Bradley" but you might see a new soldier with her original name.  This would be very tedious to test.  --[[User:Sowelu|Sowelu]] 14:15, 16 September 2008 (PDT)
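Sowelu's hypothesis, combined with the ten tries described under Test Set, can be sketched roughly as follows. This is a guess at the behavior, not the game's actual code: `generate_name` is a hypothetical function, and the name lists are a two-name excerpt of the Japanese set rather than the full 20.

```python
import random

# Excerpt of the Japanese set from the tables above (not the full 20 names).
FIRST_NAMES = ["Yoko", "Isao"]
LAST_NAMES = ["Fujimoto", "Sato"]
MAX_TRIES = 10  # number of tries mentioned in the article


def generate_name(existing_names, rng, max_tries=MAX_TRIES):
    """Draw a name, retrying while it exactly matches an existing soldier's name."""
    name = ""
    for _ in range(max_tries):
        name = f"{rng.choice(FIRST_NAMES)} {rng.choice(LAST_NAMES)}"
        if name not in existing_names:
            break
    return name  # may still be a duplicate if every try collided


rng = random.Random(0)

# With the original name on the roster, exact matches are rejected.
kept = {generate_name(["Yoko Fujimoto"], rng) for _ in range(50)}

# After renaming, the comparison no longer matches, so the name can recur.
renamed = {generate_name(["Yoko Fujimoto-xs"], rng) for _ in range(200)}

print("Yoko Fujimoto regenerated while unrenamed:", "Yoko Fujimoto" in kept)
print("Yoko Fujimoto regenerated after renaming:", "Yoko Fujimoto" in renamed)
```

The comparison here is an exact string match, which is why adding a stat string such as "-xs" would free the original name for reuse under this model.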


==See Also==
*[[Soldiers]]
*[[Hiring/firing]]
[[Category:Enemy Unknown/UFO Defense]]

Latest revision as of 20:52, 8 May 2013

Name Sets

There are six sets of X-COM soldier names, each composed of 20 first names and 20 last names. 5 of the 20 first names in each set are female (based on SOLDIER.DAT byte 67), denoted by an asterisk:

  American Set              British Set               French Set
  Austin     Bradley        Adam       Bailey         Armand     Bouissou
* Barbara    Bryant         Alan       Blake          Bernard    Bouton
  Calvin     Carr         * Andrea     Davies         Claude     Buchard
  Carl       Crossett       Arthur     Day          * Danielle   Coicaud
* Catherine  Dodge          Brett      Evans          Emile      Collignon
  Clarence   Gallagher      Damien     Hill           Gaston     Cuvelier
  Donald     Homburger      David      Jones          Gerard     Dagallier
  Dwight     Horton         Frank      Jonlan         Henri      Dreyfus
  Ed         Hudson       * Helen      Martin       * Jacqueline Dujardin
* Evelyn     Johnson        James      Parker         Jacques    Gaudin
  Kevin      Kemp         * Jane       Pearce         Jean       Gautier
  Lester     King           John       Reynolds       Leon       Gressier
  Mark       McNeil       * Maria      Robinson       Louis      Guerin
  Oscar      Miller         Michael    Sharpe         Marc       Laroyenne
* Patricia   Mitchell       Neil       Smith          Marcel     Lecointe
  Samuel     Nash           Patrick    Stewart      * Marielle   Lefevre
* Sigourney  Stephens       Paul       Taylor       * Micheline  Luget
  Spencer    Stoddard       Robert     Watson         Pierre     Marcelle
  Tom        Thompson     * Sarah      White          Rene       Pecheux
  Virgil     Webb           Scott      Wright       * Sylvie     Revenu

  German Set                Japanese Set              Russian Set
* Christel   Berger         Akinori    Akira          Anatoly    Andianov
  Dieter     Brehme         Isao       Fujimoto       Andrei     Belov
  Franz      Esser          Jungo      Ishii        * Astra      Chukarin
  Gerhard    Faerber        Kenji      Iwahara        Boris      Gorokhova
* Gudrun     Geisler      * Mariko     Iwasaki        Dmitriy    Kolotov
  Gunter     Gunkel         Masaharu   Kojima       * Galina     Korkia
  Hans       Hafner         Masanori   Koyama         Gennadi    Likhachev
* Helga      Heinsch      * Michiko    Matsumara      Grigoriy   Maleev
  Jurgen     Keller         Naohiro    Morita         Igor       Mikhailov
* Karin      Krause       * Sata       Noguchi        Ivan       Petrov
  Klaus      Mederow        Shigeo     Okabe          Leonid     Ragulin
  Manfred    Meyer          Shigeru    Okamoto      * Lyudmila   Romanov
  Matthias   Richter        Shuji      Sato           Mikhail    Samusenko
  Otto       Schultz      * Sumie      Shimaoka       Nikolai    Scharov
  Rudi       Seidler        Tatsuo     Shoji        * Olga       Shadrin
  Siegfried  Steinbach      Toshio     Tanida         Sergei     Shalimov
  Stefan     Ulbricht       Yasuaki    Tanikawa     * Tatyana    Torban
* Uta        Unger          Yataka     Yamanaka       Victor     Voronin
  Werner     Vogel        * Yoko       Yamashita      Vladimir   Yakubik
  Wolfgang   Zander         Yuzo       Yamazaki       Yuri       Zhdanovich

Columns show first and last names for each set of 20. There is no association between a particular first name and last name (above) - first names are randomly combined with last names from the same nationality. So you may see an Austin Bradley, but you will never see an Austin Bailey. There are 2,400 possible unique names (20x20x6), a fourth of which are female.

When generating a new soldier's name, the game code checks for duplication against existing soldier names (including any deceased soldiers still in SOLDIER.DAT), making up to ten tries. This makes duplicate names vanishingly rare. If you have the maximum of 250 soldiers, each try has a 250/2400 = 10.42% chance of matching an existing name, but a duplicate only results if all ten tries match, and 0.1042 raised to the 10th power is 1.5x10^-10 (i.e., about 1 in 7 billion names). If, like most folks, you have fewer than 250 soldiers, duplicates will be even rarer. With the 8 starting soldiers, the odds are about 1 in 6x10^24. So duplicates are practically impossible. But if you change a soldier's name, even just to add an asterisk, it will no longer match any game-generated name, and the original name might appear again.
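The arithmetic above can be reproduced with a short calculation. This is a minimal sketch that assumes every existing soldier has a distinct name and that each try picks uniformly among the 2,400 combinations; `duplicate_probability` is a hypothetical helper name.

```python
from fractions import Fraction

POSSIBLE_NAMES = 20 * 20 * 6  # 2,400 combinations across the six sets
TRIES = 10                    # number of tries described above


def duplicate_probability(existing_soldiers, tries=TRIES):
    """Chance that every try matches an existing name, assuming distinct
    existing names and uniform, independent draws."""
    per_try = Fraction(existing_soldiers, POSSIBLE_NAMES)
    return per_try ** tries


for roster_size in (8, 250):
    p = duplicate_probability(roster_size)
    print(f"{roster_size:3d} soldiers: p = {float(p):.2e}, "
          f"about 1 in {float(1 / p):.1e}")
```

Using `Fraction` keeps the result exact until it is printed, which avoids rounding noise in probabilities this small.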


See Also