|
1 | 1 | # NotebookLM Detector
|
2 | 2 |
|
3 |
| -This is a simple tool to detect if an audio file is generated by NotebookLM. |
| 3 | +A simple tool to detect whether an audio file was generated by [NotebookLM](https://notebooklm.google/). |
4 | 4 |
|
5 |
| -## Detect |
| 5 | +At [Listen Notes](https://www.listennotes.com/), we've encountered a growing number of spammers submitting fake, |
| 6 | +NotebookLM-generated podcasts to our platform. |
6 | 7 |
|
7 |
| -Install dependencies first: |
| 8 | +We hoped the NotebookLM team would provide a tool to help detect NotebookLM-generated audio. |
| 9 | +However, after a week of back-and-forth emails, we lost patience. |
| 10 | + |
| 11 | +It's now Friday (Oct 4, 2024), and since we won't hear back from the NotebookLM team until next week, |
| 12 | +we decided to put together this simple script. Luckily, it seems to work! |
| 13 | + |
| 14 | + |
| 15 | +## Detection |
| 16 | + |
| 17 | +### Install Dependencies |
8 | 18 |
|
9 | 19 | ```shell
|
10 | 20 | $ pip install -r requirements.txt
|
11 | 21 | ```
|
12 | 22 |
|
13 |
| -Run the script to detect: |
| 23 | +### Run the Detection Script |
| 24 | + |
| 25 | +To detect whether an audio file is AI-generated or human-produced, run the following command: |
14 | 26 | ```shell
|
15 | 27 | $ python notebooklm_detector.py --action predict --file_path [filename].mp3
|
16 | 28 | ```
|
17 | 29 |
|
18 |
| -## Train |
| 30 | +You’ll see output like this: |
| 31 | +```shell |
| 32 | +The audio is: AI Generated |
| 33 | +``` |
| 34 | +or |
| 35 | +```shell |
| 36 | +The audio is: Human |
| 37 | +``` |
| 38 | + |
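To check many files, the predict command can be wrapped in a small Python helper. The following is a minimal sketch, not part of the repository: the helper names (`build_predict_command`, `parse_verdict`, `detect`) are hypothetical, and it assumes the script is run from the repository root and prints a line starting with `The audio is:` to standard output, as in the sample output in this README.

```python
import subprocess
import sys
from typing import List, Optional

# Assumption: the script reports its result on a line with this prefix,
# as shown in the README's sample output.
VERDICT_PREFIX = "The audio is:"


def build_predict_command(file_path: str) -> List[str]:
    """Build the documented predict command for one audio file."""
    return [
        sys.executable,
        "notebooklm_detector.py",  # assumed to be in the current directory
        "--action", "predict",
        "--file_path", file_path,
    ]


def parse_verdict(output: str) -> Optional[str]:
    """Return the text after 'The audio is:', or None if no such line exists."""
    for line in output.splitlines():
        # Tolerate a leading "$ " in case the output was copied from the README.
        cleaned = line.strip().lstrip("$").strip()
        if cleaned.startswith(VERDICT_PREFIX):
            return cleaned[len(VERDICT_PREFIX):].strip()
    return None


def detect(file_path: str) -> Optional[str]:
    """Run the detector on one file and return its verdict string."""
    result = subprocess.run(
        build_predict_command(file_path),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if the script exits non-zero
    )
    return parse_verdict(result.stdout)
```

Calling `detect("episode.mp3")` (a placeholder filename) would then return the verdict string, or `None` if the expected line is missing from the output.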
| 39 | +## Training the Model |
| 40 | + |
| 41 | +You can train the model and regenerate `model.pkl` by following these steps: |
| 42 | + |
| 43 | +### Step 1: Organize the Dataset |
19 | 44 |
|
20 |
| -You can train and regenerate model.pkl: |
| 45 | +* Place NotebookLM-generated audio files (mp3, wav, or mp4) in the `datasets/ai/` folder. |
| 46 | +* Place human-produced audio files in the `datasets/human/` folder. |
21 | 47 |
|
22 |
| -Step 1: Put NotebookLM-generated audio files (mp3, wav, or mp4) in datasets/ai/ folder. |
23 |
| -And put human-produced audio files in datasets/human/ folder. |
| 48 | +### Step 2: Run the Training Script |
24 | 49 |
|
25 |
| -Step 2: Run the script to train: |
| 50 | +To train the model, run: |
26 | 51 | ```shell
|
27 | 52 | $ python notebooklm_detector.py --action train --dataset_path datasets
|
28 | 53 | ```
|
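Before running the training command above, it can help to confirm that the folder layout from Step 1 is in place. The following is a minimal sketch, not part of the repository: `count_audio_files` is a hypothetical helper. It assumes the same three extensions (mp3, wav, mp4) apply to both folders and counts only files directly inside each folder, which may differ from what the training script actually reads.

```python
from pathlib import Path
from typing import Dict

# Assumption: the extensions the README lists for AI files also apply to human files.
AUDIO_EXTENSIONS = {".mp3", ".wav", ".mp4"}


def count_audio_files(dataset_path: str) -> Dict[str, int]:
    """Count audio files directly inside <dataset_path>/ai and <dataset_path>/human."""
    counts: Dict[str, int] = {}
    for label in ("ai", "human"):
        folder = Path(dataset_path) / label
        if not folder.is_dir():
            # A missing folder is reported as zero files rather than an error.
            counts[label] = 0
            continue
        counts[label] = sum(
            1
            for path in folder.iterdir()
            if path.is_file() and path.suffix.lower() in AUDIO_EXTENSIONS
        )
    return counts


if __name__ == "__main__":
    for label, count in count_audio_files("datasets").items():
        print(f"datasets/{label}/: {count} audio file(s)")
```

A count of zero for either folder suggests the dataset is not yet organized as Step 1 describes.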