2025-05-03 17:44:37 +02:00
# Discord Emojis
A Python tool that automatically fetches and extracts the latest emoji data from Discord.
## Overview
2025-05-03 17:57:06 +02:00
This project automatically downloads the latest Discord build from the [Discord-Datamining ](https://github.com/Discord-Datamining/Discord-Datamining ) repository, extracts emoji data, and saves it in a structured JSON format. It runs as a GitHub Actions workflow twice a week and opens a pull-request to keep the emoji data up-to-date, without manual intervention.
2025-05-03 17:44:37 +02:00
## How It Works
- GitHub Actions workflow runs automatically twice per week
- Downloads the latest Discord build
- Extracts emoji information
- Saves data in a standardized JSON format
- Tracks changes using hash comparison to avoid unnecessary updates
2025-05-03 17:57:06 +02:00
- Detects and reports unhandled UTF-16 surrogate pairs
2025-05-03 17:44:37 +02:00
## Technical Details
The project uses:
- Python 3.13+
- Dependencies:
- json5
- requests
## Output
The emoji data is saved in `build/emojis.json` in the following format:
2025-05-03 17:57:06 +02:00
2025-05-03 18:15:23 +02:00
```json5
2025-05-03 17:44:37 +02:00
{
"emojis": [
{
2025-05-03 17:57:06 +02:00
"names": [
"grinning",
"grinning_face"
],
"surrogates": "😀",
"unicodeVersion": 6.1,
"spriteIndex": 0
2025-05-03 17:44:37 +02:00
},
2025-05-03 17:57:06 +02:00
// More emoji entries...
],
"emojisByCategory": {
"people": [
0,
509
],
// More categories...
},
"nameToEmoji": {
"100": 1410,
"1234": 1488,
"grinning": 0,
// More name mappings...
},
"surrogateToEmoji": {
"😀": 0,
"😃": 1,
"😄": 2,
// More surrogate mappings...
},
"numDiversitySprites": 310,
"numNonDiversitySprites": 1614
2025-05-03 17:44:37 +02:00
}
```
2025-05-03 17:57:06 +02:00
### Format Explanation
- **emojis**: Array of emoji objects containing:
- **names**: Array of names/aliases for the emoji
- **surrogates**: Unicode representation of the emoji
- **unicodeVersion**: Version where the emoji was introduced
- **spriteIndex**: Index in Discord's sprite sheet
- **emojisByCategory**: Object mapping category names to arrays of starting and ending indices in the emoji array
- **nameToEmoji**: Mapping of emoji names to their index in the emoji array (used for quick lookups)
- **surrogateToEmoji**: Mapping of emoji unicode characters to their index in the emoji array (used for quick lookups)
- **numDiversitySprites**: Number of skin tone modifier sprites available (e.g., different skin tones for hand gestures)
- **numNonDiversitySprites**: Number of standard emoji sprites that don't have skin tone modifiers
## Easy Access
The easiest way to access the emojis data is via the direct raw GitHub URL:
```
https://raw.githubusercontent.com/Paillat-dev/discord-emojis/refs/heads/master/build/emojis.json
```
2025-05-03 17:44:37 +02:00
## Development
For local development or testing:
```bash
# Install dependencies using uv
uv sync --dev
# Run the update script manually
uv run src
```
Note: Local execution should only be done during development. In production, the script runs automatically through GitHub Actions.
The script will:
1. Download the latest Discord build
2. Extract emoji information
3. Save the data to `build/emojis.json`
4. Generate a hash file to track changes
If no changes are detected compared to the previous run, the script will exit with status code 3.