This repository holds the code for the paper:
Galke, L., Ram, Y. & Raviv, L. Deep neural networks and humans both benefit from compositional language structure. Nat Commun 15, 10816 (2024). https://doi.org/10.1038/s41467-024-55158-1
@article{galkeDeepNeuralNetworks2024, title = {Deep Neural Networks and Humans Both Benefit from Compositional Language Structure}, author = {Galke, Lukas and Ram, Yoav and Raviv, Limor}, year = {2024}, journal = {Nature Communications}, volume = {15}, number = {1}, pages = {10816}, issn = {2041-1723}, doi = {10.1038/s41467-024-55158-1} }- Set up a virtual environment (e.g., via conda) with a recent python version (we used Python 3.9.5)
- Within the virtual environment, install PyTorch according to your OS, GPU availability, and Python package manager.
- Within the virtual environment, install all other requirements via
pip install -r requirements.txt
The data can be obtained via OSF and should be placed in the ./data subfolder. In particular, you need all LearningExp_*_log.txt files and the input_languages.csv file.
The main entry point is train.py. Information on command line arguments can be obtained via python3 train.py -h.
An exemplary command to run an experiment is
python3 train.py --as_humans /data/path/to/experiment.log --seed 1234 --outdir "results-v1"Use the following command to reproduce the main experiments from the paper, sweeping over all experiment log files ten times with different random seeds.
bash sweep_as_humans.bashResults will be stored in a subfolder results-v1.
The main file for running our experiments with GPT-3 is lang2prompt.py. It expects data directory to be present and filled and will write its outputs to gpt3-completions. You need to specify a language id (S1,B1,S2,...,S5,B5) as a command line argument.
An example call to run the memorization test and the generalization test on language B4 would be:
python3 lang2prompt.py B4 --gpt3-mem-test --gpt3-reg-testImportant: You need to make sure that the shell environment variable OPENAI_API_KEY holds your API key and edit the line starting with openai.organization with your corresponding organization id.
python3 lang2prompt.py B4 --gpt3-mem-test --gpt3-reg-testUse the following command to reproduce the statistical analysis.
python3 stats.py -o stats-output results-v1