Extraction and analysis of character networks from bandes dessinées, comics, mangas, and such
- Copyright 2018-2022 Vincent Labatut
NaNet is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation. For source availability and license information see licence.txt
- Lab site: http://lia.univ-avignon.fr/
- GitHub repo: https://github.com/CompNet/NaNet
- Data: https://doi.org/10.5281/zenodo.6395874
- Contact: Vincent Labatut vincent.labatut@univ-avignon.fr
If you use this source code or the associated dataset, please cite reference [L'22].
This set of R scripts extracts and analyzes character networks from graphic novels. It operates on manually constituted CSV files, so in principle the work of fiction could be something other than a graphic novel, provided the input format is respected.
The script does the following:
- Extracts various networks based on some tabular data containing individual and relational information.
- Computes a number of statistics and generates the corresponding plots.
- Performs additional analysis of the networks.
The raw dataset was manually constituted based on the bande dessinée Thorgal. The output files (graphs, plots, tables...) can be obtained by running the scripts, but they are also directly available on Zenodo.
Here are the folders composing the project:
- Folder data: contains the data used by the R scripts, as well as the data produced by them. Each subfolder corresponds to a different series, and has the same structure:
  - File characters.csv: list of characters, see the example in folder Test.
  - File interactions.csv: list of scenes with the involved characters.
  - File pages.csv: list of pages with their number of panels.
  - File volumes.csv: list of volumes (issues) in the series.
  - Folder networks: all the networks extracted from the above tables, as Graphml files and plots.
  - Folder stats: CSV and plot files containing the statistics computed for the corpus and for these networks.
- Folder log: logs produced when running the scripts.
- Folder res: resources used by the R scripts.
- Folder src: contains the R source code.
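To give an idea of how a table of scenes can be turned into a character network, here is a minimal sketch in base R. The column names (Scene, Characters), the ";" separator, and the toy data are all assumptions made for illustration; they do not describe the actual layout of interactions.csv, nor the extraction logic implemented in src.

```r
# Toy scene table. The column names and the ";" separator are hypothetical,
# they are NOT the actual layout of interactions.csv.
scenes <- data.frame(
  Scene = 1:3,
  Characters = c("Alice;Bob", "Alice;Bob;Carol", "Carol"),
  stringsAsFactors = FALSE
)

# For each scene, list every unordered pair of co-occurring characters.
pairs_per_scene <- lapply(
  strsplit(scenes$Characters, ";", fixed = TRUE),
  function(chars) {
    chars <- sort(unique(chars))
    if (length(chars) < 2) return(NULL)  # a lone character yields no edge
    t(combn(chars, 2))                   # one row per pair
  }
)
pairs <- do.call(rbind, pairs_per_scene)

# Count the number of scenes shared by each pair: this is the edge weight.
edges <- aggregate(
  list(Weight = rep(1, nrow(pairs))),
  by = list(From = pairs[, 1], To = pairs[, 2]),
  FUN = sum
)
print(edges)
```

The resulting weighted edge list could then be passed to a graph library such as igraph (listed in the dependencies) to build and export the network.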
You first need to install R and the required packages:
- Install the R language.
- Download this project from GitHub and unzip.
- Install the required packages:
  - Open the R console.
  - Set the unzipped directory as the working directory, using setwd("<my directory>").
  - Run the install script src/_install.R (that may take a while).
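In the R console, the package installation steps above could look like the following sketch. The path is the placeholder from the instructions, and the directory check is an added precaution, not something the project requires.

```r
# Placeholder from the steps above: replace with the unzipped project folder.
project_dir <- "<my directory>"

if (dir.exists(project_dir)) {
  setwd(project_dir)
  # Installs the required packages (that may take a while).
  source("src/_install.R")
} else {
  message("Directory not found: ", project_dir)
}
```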
Part of the analysis requires compiling some C code. The main instructions are in src/common/stats/pli/README.txt; then follow the instructions in the following files (look for the TODOs):
- src/common/stats/pli/zeta.R: concerns the files in folder src/common/stats/pli/zeta-function.
- src/common/stats/pli/powerexp.R: concerns the files in folder src/common/stats/pli/exponential-integral.
- src/common/stats/pli/discpowerexp.R: concerns the file in folder src/common/stats/pli/discpowerexp.
In order to extract the networks from the raw data, compute the statistics, and generate the plots:
- Open the R console.
- Set the current directory as the working directory, using setwd("<my directory>").
- Run the main script src/dev_main.R.
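Since the main script can run for a long time, it may be convenient to wrap the steps above in a small function that reports the elapsed time and restores the previous working directory afterwards. The helper run_from_project below is hypothetical and not part of NaNet; it is only a sketch of one way to do this.

```r
# Hypothetical helper (not part of NaNet): run a project script from the
# project root, then restore the previous working directory.
run_from_project <- function(project_dir, script = "src/dev_main.R") {
  if (!dir.exists(project_dir)) stop("Directory not found: ", project_dir)
  old_wd <- setwd(project_dir)          # setwd() returns the previous directory
  on.exit(setwd(old_wd), add = TRUE)    # restored even if the script fails
  if (!file.exists(script)) stop("Script not found: ", script)
  elapsed <- system.time(source(script))[["elapsed"]]
  message(sprintf("%s finished in %.1f s", script, elapsed))
  invisible(elapsed)
}

# Usage (placeholder path from the steps above):
# run_from_project("<my directory>")
```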
The scripts will produce a number of files in the subfolders of folder nets. They are grouped in subsubfolders, each one corresponding to a specific topological measure (degree, closeness, etc.).
The script src/Labatut2022.R reproduces the computations described in article [L'22]. For this purpose, please use v1.0.2 of the source code, available on the Releases page. Be warned that the computations take a while (possibly several days). You can also directly retrieve the data resulting from this process on Zenodo.
Tested with R version 4.0.5, with the following packages:
- blockmodeling: version 1.0.5.
- CINNA: version 1.1.54.
- cluster: version 2.1.0.
- data.table: version 1.13.0.
- doParallel: version 1.0.16.
- ercv: version 1.0.1.
- foreach: version 1.5.0.
- future.apply: version 1.6.0.
- ggExtra: version 0.9.
- ggplot2: version 3.3.3.
- igraph: version 1.2.6.
- latex2exp: version 0.4.0.
- minpack.lm: version 1.2.1.
- perm: version 1.0.0.2.
- plotfunctions: version 1.4.
- polynom: version 1.4.0.
- poweRlaw: version 0.70.6.
- SDMTools: version 1.1.221.
- sfsmisc: version 1.1.12.
- stringr: version 1.4.0.
- vioplot: version 0.3.6.
- viridis: version 0.6.0.
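To compare a local installation with the tested versions, a sketch such as the following may help. The helper check_versions is hypothetical (not part of NaNet), and only a few packages from the list are shown; the vector can be extended with the remaining entries. Note that requireNamespace loads the package namespace as a side effect.

```r
# A few of the tested versions listed above (extend with the rest as needed).
tested <- c(igraph = "1.2.6", ggplot2 = "3.3.3", stringr = "1.4.0")

# Hypothetical helper: compare installed versions with the tested ones.
check_versions <- function(tested) {
  installed <- vapply(names(tested), function(pkg) {
    if (requireNamespace(pkg, quietly = TRUE)) {
      as.character(utils::packageVersion(pkg))
    } else {
      NA_character_   # package not installed
    }
  }, character(1))
  data.frame(
    package = names(tested),
    tested = unname(tested),
    installed = unname(installed),
    same = !is.na(installed) & unname(installed == tested),
    stringsAsFactors = FALSE
  )
}

print(check_versions(tested))
```

A mismatch does not necessarily mean the scripts will fail; it only indicates a configuration that differs from the one the code was tested with.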
- ...
- [L'22] Labatut, V. Complex Network Analysis of a Graphic Novel: The Case of the Bande Dessinée Thorgal, Advances in Complex Systems, p.22400033, 2022. ⟨hal-03694768⟩ - DOI: 10.1142/S0219525922400033