GenUP: Generative User Profilers as In-Context Learners for Next POI Recommender Systems

Wilson Wongso, Hao Xue, Flora D. Salim

1 School of Computer Science and Engineering, University of New South Wales, Sydney, Australia
2 ARC Centre of Excellence for Automated Decision Making + Society


Overview

(Figure: overview of the GenUP framework)

Download Datasets

LLM4POI Datasets

We followed the dataset preparation of LLM4POI for the FourSquare-NYC, Gowalla-CA, and FourSquare-TKY datasets; please refer to their repository for more details. We also provide the processed datasets on Hugging Face.

❗️ Moscow and Sao Paulo preprocessing steps will be made available soon.

Processed Datasets

We provide the processed, Q&A versions of the LLM4POI, Moscow, and Sao Paulo datasets (a minimal loading sketch follows the table):

| Dataset | Link |
| --- | --- |
| FourSquare-NYC | Hugging Face |
| Gowalla-CA | Hugging Face |
| FourSquare-TKY | Hugging Face |
| FourSquare-Moscow | Hugging Face |
| FourSquare-SaoPaulo | Hugging Face |
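
The processed datasets can be loaded directly with the Hugging Face datasets library. A minimal sketch, assuming the repository IDs above and a standard split layout (check each dataset card for the actual split and column names):

from datasets import load_dataset

# Load one of the processed Q&A-style datasets from the Hugging Face Hub.
# The "train" split name and column layout are assumptions; see the dataset card.
dataset = load_dataset("w11wo/FourSquare-NYC-POI")
print(dataset)              # available splits and columns
print(dataset["train"][0])  # one Q&A example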

(Optional) Prepare Datasets

Generate User Profiles

To generate the user profiles from the check-in records, run the following commands:

⚠️ Make sure to replace w11wo with your Hugging Face username.

python src/generate_user_profile.py --dataset nyc --dataset_id w11wo/FourSquare-NYC-User-Profiles
python src/generate_user_profile.py --dataset ca --dataset_id w11wo/Gowalla-CA-User-Profiles
python src/generate_user_profile.py --dataset tky --dataset_id w11wo/FourSquare-TKY-User-Profiles

Create SFT Dataset

To create the SFT dataset from the user profiles and the POI data, run the following commands:

python src/create_sft_dataset.py --dataset nyc --dataset_id w11wo/FourSquare-NYC-POI
python src/create_sft_dataset.py --dataset ca --dataset_id w11wo/Gowalla-CA-POI
python src/create_sft_dataset.py --dataset tky --dataset_id w11wo/FourSquare-TKY-POI

Supervised Fine-tuning

We provide the training scripts and recipes for the GenUP-Llama models presented in our paper. In our setup, we used 2 H100 GPUs with QLoRA and FSDP for multi-GPU training. A rough sketch of the QLoRA setup is shown below, followed by the per-model training commands and hyperparameters:
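
The core of the QLoRA setup in src/train_sft_qlora_fsdp.py roughly corresponds to the following sketch. This is an illustrative outline only: the 4-bit NF4 quantization settings and the LoRA hyperparameters (r, alpha, target modules) below are assumptions, and the actual argument parsing, FSDP wiring, and trainer setup live in the script itself.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

model_id = "w11wo/Llama-2-7b-longlora-32k-merged"

# 4-bit NF4 quantization: the "Q" in QLoRA.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_use_double_quant=True,
)
model = AutoModelForCausalLM.from_pretrained(model_id, quantization_config=bnb_config)
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Wrap the frozen 4-bit base model with trainable LoRA adapters (hypothetical hyperparameters).
model = prepare_model_for_kbit_training(model)
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()

# The SFT dataset (e.g. w11wo/FourSquare-NYC-POI) is then tokenized and the adapters are
# trained with a standard causal-LM objective, sharded across GPUs with FSDP.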

Llama-2-7B-LongLoRA-32k

FourSquare-NYC-POI
ACCELERATE_USE_FSDP=1 FSDP_CPU_RAM_EFFICIENT_LOADING=1 torchrun --nproc_per_node=2 src/train_sft_qlora_fsdp.py \
    --model_checkpoint "w11wo/Llama-2-7b-longlora-32k-merged" \
    --max_length 16384 \
    --batch_size 1 \
    --learning_rate 2e-5 \
    --max_grad_norm 1.0 \
    --warmup_steps 20 \
    --num_epochs 3 \
    --gradient_checkpointing \
    --apply_liger_kernel_to_llama \
    --dataset_id "w11wo/FourSquare-NYC-POI"
Gowalla-CA-POI
ACCELERATE_USE_FSDP=1 FSDP_CPU_RAM_EFFICIENT_LOADING=1 torchrun --nproc_per_node=2 src/train_sft_qlora_fsdp.py \
    --model_checkpoint "w11wo/Llama-2-7b-longlora-32k-merged" \
    --max_length 16384 \
    --batch_size 2 \
    --learning_rate 2e-5 \
    --max_grad_norm 1.0 \
    --warmup_steps 20 \
    --num_epochs 3 \
    --gradient_checkpointing \
    --apply_liger_kernel_to_llama \
    --dataset_id "w11wo/Gowalla-CA-POI"
FourSquare-TKY-POI
ACCELERATE_USE_FSDP=1 FSDP_CPU_RAM_EFFICIENT_LOADING=1 torchrun --nproc_per_node=2 src/train_sft_qlora_fsdp.py \
    --model_checkpoint "w11wo/Llama-2-7b-longlora-32k-merged" \
    --max_length 16384 \
    --batch_size 2 \
    --learning_rate 2e-5 \
    --max_grad_norm 1.0 \
    --warmup_steps 20 \
    --num_epochs 3 \
    --gradient_checkpointing \
    --apply_liger_kernel_to_llama \
    --dataset_id "w11wo/FourSquare-TKY-POI"
FourSquare-Moscow-POI
ACCELERATE_USE_FSDP=1 FSDP_CPU_RAM_EFFICIENT_LOADING=1 torchrun --nproc_per_node=2 src/train_sft_qlora_fsdp.py \
    --model_checkpoint "w11wo/Llama-2-7b-longlora-32k-merged" \
    --max_length 16384 \
    --batch_size 2 \
    --learning_rate 2e-5 \
    --max_grad_norm 1.0 \
    --warmup_steps 20 \
    --num_epochs 3 \
    --gradient_checkpointing \
    --apply_liger_kernel_to_llama \
    --dataset_id "w11wo/FourSquare-Moscow-POI"
FourSquare-SaoPaulo-POI
ACCELERATE_USE_FSDP=1 FSDP_CPU_RAM_EFFICIENT_LOADING=1 torchrun --nproc_per_node=2 src/train_sft_qlora_fsdp.py \
    --model_checkpoint "w11wo/Llama-2-7b-longlora-32k-merged" \
    --max_length 16384 \
    --batch_size 2 \
    --learning_rate 2e-5 \
    --max_grad_norm 1.0 \
    --warmup_steps 20 \
    --num_epochs 3 \
    --gradient_checkpointing \
    --apply_liger_kernel_to_llama \
    --dataset_id "w11wo/FourSquare-SaoPaulo-POI"

Llama-3.1-8B

FourSquare-NYC-POI
ACCELERATE_USE_FSDP=1 FSDP_CPU_RAM_EFFICIENT_LOADING=1 torchrun --nproc_per_node=2 src/train_sft_qlora_fsdp.py \
    --model_checkpoint "meta-llama/Meta-Llama-3.1-8B" \
    --max_length 16384 \
    --batch_size 1 \
    --learning_rate 2e-4 \
    --max_grad_norm 1.0 \
    --warmup_steps 20 \
    --num_epochs 3 \
    --gradient_checkpointing \
    --apply_liger_kernel_to_llama \
    --dataset_id "w11wo/FourSquare-NYC-POI"
Gowalla-CA-POI
ACCELERATE_USE_FSDP=1 FSDP_CPU_RAM_EFFICIENT_LOADING=1 torchrun --nproc_per_node=2 src/train_sft_qlora_fsdp.py \
    --model_checkpoint "meta-llama/Meta-Llama-3.1-8B" \
    --max_length 16384 \
    --batch_size 2 \
    --learning_rate 2e-4 \
    --max_grad_norm 1.0 \
    --warmup_steps 20 \
    --num_epochs 3 \
    --gradient_checkpointing \
    --apply_liger_kernel_to_llama \
    --dataset_id "w11wo/Gowalla-CA-POI"
FourSquare-TKY-POI
ACCELERATE_USE_FSDP=1 FSDP_CPU_RAM_EFFICIENT_LOADING=1 torchrun --nproc_per_node=2 src/train_sft_qlora_fsdp.py \
    --model_checkpoint "meta-llama/Meta-Llama-3.1-8B" \
    --max_length 16384 \
    --batch_size 2 \
    --learning_rate 2e-4 \
    --max_grad_norm 1.0 \
    --warmup_steps 20 \
    --num_epochs 3 \
    --gradient_checkpointing \
    --apply_liger_kernel_to_llama \
    --dataset_id "w11wo/FourSquare-TKY-POI"
FourSquare-Moscow-POI
ACCELERATE_USE_FSDP=1 FSDP_CPU_RAM_EFFICIENT_LOADING=1 torchrun --nproc_per_node=2 src/train_sft_qlora_fsdp.py \
    --model_checkpoint "meta-llama/Meta-Llama-3.1-8B" \
    --max_length 16384 \
    --batch_size 2 \
    --learning_rate 2e-4 \
    --max_grad_norm 1.0 \
    --warmup_steps 20 \
    --num_epochs 3 \
    --gradient_checkpointing \
    --apply_liger_kernel_to_llama \
    --dataset_id "w11wo/FourSquare-Moscow-POI"
FourSquare-SaoPaulo-POI
ACCELERATE_USE_FSDP=1 FSDP_CPU_RAM_EFFICIENT_LOADING=1 torchrun --nproc_per_node=2 src/train_sft_qlora_fsdp.py \
    --model_checkpoint "meta-llama/Meta-Llama-3.1-8B" \
    --max_length 16384 \
    --batch_size 2 \
    --learning_rate 2e-4 \
    --max_grad_norm 1.0 \
    --warmup_steps 20 \
    --num_epochs 3 \
    --gradient_checkpointing \
    --apply_liger_kernel_to_llama \
    --dataset_id "w11wo/FourSquare-SaoPaulo-POI"

Llama-3.2-1B

FourSquare-NYC-POI
ACCELERATE_USE_FSDP=1 FSDP_CPU_RAM_EFFICIENT_LOADING=1 torchrun --nproc_per_node=2 src/train_sft_qlora_fsdp.py \
    --model_checkpoint "meta-llama/Meta-Llama-3.2-1B" \
    --max_length 16384 \
    --batch_size 1 \
    --learning_rate 2e-4 \
    --max_grad_norm 1.0 \
    --warmup_steps 20 \
    --num_epochs 3 \
    --gradient_checkpointing \
    --apply_liger_kernel_to_llama \
    --dataset_id "w11wo/FourSquare-NYC-POI"
Gowalla-CA-POI
ACCELERATE_USE_FSDP=1 FSDP_CPU_RAM_EFFICIENT_LOADING=1 torchrun --nproc_per_node=2 src/train_sft_qlora_fsdp.py \
    --model_checkpoint "meta-llama/Meta-Llama-3.2-1B" \
    --max_length 16384 \
    --batch_size 2 \
    --learning_rate 2e-4 \
    --max_grad_norm 1.0 \
    --warmup_steps 20 \
    --num_epochs 3 \
    --gradient_checkpointing \
    --apply_liger_kernel_to_llama \
    --dataset_id "w11wo/Gowalla-CA-POI"
FourSquare-TKY-POI
ACCELERATE_USE_FSDP=1 FSDP_CPU_RAM_EFFICIENT_LOADING=1 torchrun --nproc_per_node=2 src/train_sft_qlora_fsdp.py \
    --model_checkpoint "meta-llama/Meta-Llama-3.2-1B" \
    --max_length 16384 \
    --batch_size 2 \
    --learning_rate 2e-4 \
    --max_grad_norm 1.0 \
    --warmup_steps 20 \
    --num_epochs 3 \
    --gradient_checkpointing \
    --apply_liger_kernel_to_llama \
    --dataset_id "w11wo/FourSquare-TKY-POI"
FourSquare-Moscow-POI
ACCELERATE_USE_FSDP=1 FSDP_CPU_RAM_EFFICIENT_LOADING=1 torchrun --nproc_per_node=2 src/train_sft_qlora_fsdp.py \
    --model_checkpoint "meta-llama/Meta-Llama-3.2-1B" \
    --max_length 16384 \
    --batch_size 2 \
    --learning_rate 2e-4 \
    --max_grad_norm 1.0 \
    --warmup_steps 20 \
    --num_epochs 3 \
    --gradient_checkpointing \
    --apply_liger_kernel_to_llama \
    --dataset_id "w11wo/FourSquare-Moscow-POI"
FourSquare-SaoPaulo-POI
ACCELERATE_USE_FSDP=1 FSDP_CPU_RAM_EFFICIENT_LOADING=1 torchrun --nproc_per_node=2 src/train_sft_qlora_fsdp.py \
    --model_checkpoint "meta-llama/Meta-Llama-3.2-1B" \
    --max_length 16384 \
    --batch_size 2 \
    --learning_rate 2e-4 \
    --max_grad_norm 1.0 \
    --warmup_steps 20 \
    --num_epochs 3 \
    --gradient_checkpointing \
    --apply_liger_kernel_to_llama \
    --dataset_id "w11wo/FourSquare-SaoPaulo-POI"

Next POI Evaluation

To predict the next POI for a given trajectory, the fine-tuned models directly generate the ID of the target POI. Different models and datasets require different generation parameters; a minimal sketch of the generation step and example commands for each model are shown below:
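
At its core, the evaluation reduces to sampling a short POI ID from a fine-tuned checkpoint with the transformers generate API. A minimal sketch, in which the prompt string and answer parsing are simplified placeholders rather than the exact format used by src/eval_next_poi.py:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

checkpoint = "w11wo/Meta-Llama-3.1-8B-FourSquare-TKY-POI"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForCausalLM.from_pretrained(checkpoint, torch_dtype=torch.bfloat16, device_map="auto")

# Placeholder prompt: the real script builds it from the user profile and the current trajectory.
prompt = "<user profile and current trajectory> Which POI id will this user visit next?"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

outputs = model.generate(
    **inputs,
    max_new_tokens=8,        # the answer is a short POI ID
    do_sample=True,
    temperature=0.65,
    top_k=50,
    top_p=0.92,
    typical_p=0.95,
    repetition_penalty=1.0,
)
predicted_id = tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)
print(predicted_id.strip())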

Llama-2-7B-LongLoRA-32k on FourSquare-NYC-POI
accelerate launch src/eval_next_poi.py \
    --model_checkpoint "w11wo/Llama-2-7b-longlora-32k-merged-FourSquare-NYC-POI" \
    --dataset_id "w11wo/FourSquare-NYC-POI" \
    --apply_liger_kernel_to_llama
Llama-3.1-8B on FourSquare-TKY-POI
accelerate launch src/eval_next_poi.py \
    --model_checkpoint "w11wo/Meta-Llama-3.1-8B-FourSquare-TKY-POI" \
    --dataset_id "w11wo/FourSquare-TKY-POI" \
    --apply_liger_kernel_to_llama \
    --temperature 0.65 \
    --top_k 50 \
    --top_p 0.92 \
    --typical_p 0.95 \
    --repetition_penalty 1.0
Llama-3.2-1B on FourSquare-TKY-POI
accelerate launch src/eval_next_poi.py \
    --model_checkpoint "w11wo/Meta-Llama-3.2-1B-FourSquare-TKY-POI" \
    --dataset_id "w11wo/Gowalla-CA-POI" \
    --apply_liger_kernel_to_llama \
    --temperature 0.65 \
    --top_k 50 \
    --top_p 0.92 \
    --typical_p 0.95 \
    --repetition_penalty 1.0
Llama-3.1-8B on FourSquare-Moscow-POI
accelerate launch src/eval_next_poi.py \
    --model_checkpoint "w11wo/Llama-3.1-8B-FourSquare-Moscow-POI" \
    --dataset_id "w11wo/FourSquare-Moscow-POI" \
    --apply_liger_kernel_to_llama \
    --temperature 0.6 \
    --top_k 50 \
    --top_p 0.1 \
    --typical_p 0.95 \
    --repetition_penalty 1.0
Llama-3.2-1B on FourSquare-SaoPaulo-POI
accelerate launch src/eval_next_poi.py \
    --model_checkpoint "w11wo/Meta-Llama-3.2-1B-FourSquare-SaoPaulo-POI" \
    --dataset_id "w11wo/FourSquare-SaoPaulo-POI" \
    --apply_liger_kernel_to_llama \
    --temperature 0.6 \
    --top_k 50 \
    --top_p 0.1 \
    --typical_p 0.95 \
    --repetition_penalty 1.0

POI Prediction Results

| Model | History | Other Users | NYC | TKY | CA |
| --- | --- | --- | --- | --- | --- |
| Without historical and intra-user social data | | | | | |
| LLM4POI* | × | × | 0.2356 | 0.1517 | 0.1016 |
| GenUP-Llama2-7b | × | × | 0.2575 | 0.1699 | 0.1094 |
| GenUP-Llama3.1-8b | × | × | 0.2582 | 0.2127 | 0.1339 |
| GenUP-Llama3.2-1b | × | × | 0.2484 | 0.1851 | 0.1267 |
| With historical and intra-user social data | | | | | |
| GETNext | | | 0.2435 | 0.2254 | 0.1357 |
| STHGCN | | | 0.2734 | 0.2950 | 0.1730 |
| LLM4POI | | | 0.3372 | 0.3035 | 0.2065 |

| Model | Moscow | Sao Paulo |
| --- | --- | --- |
| Supervised fine-tuning | | |
| LLM4POI* | 0.146 | 0.166 |
| GenUP-Llama2-7b | 0.159 | 0.175 |
| GenUP-Llama3.1-8b | 0.163 | 0.178 |
| GenUP-Llama3.2-1b | 0.161 | 0.175 |
| In-context learning | | |
| LLM-Mob | 0.080 | 0.140 |
| LLM-ZS | 0.120 | 0.165 |
| LLM agents with external knowledge | | |
| AgentMove | 0.160 | 0.230 |

User Cold-start Analysis

To analyze the performance of the models on different user groups (e.g. cold-start users), run the following command:

python src/user_cold_start_analysis.py \
    --model_checkpoint w11wo/Llama-2-7b-longlora-32k-merged-FourSquare-NYC-POI \
    --dataset_id w11wo/FourSquare-NYC-POI
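
Conceptually, the analysis groups users by how active they are in the training data and reports accuracy per group. A toy sketch of this grouping, assuming tercile-based Inactive/Normal/Very Active bins (the actual thresholds and logic in src/user_cold_start_analysis.py may differ):

import pandas as pd

# Per-user number of training check-ins (toy numbers), used as an activity proxy.
train_counts = pd.Series({1: 5, 2: 12, 3: 40, 4: 85, 5: 200, 6: 310}, name="n_train")

# Split users into three equally sized activity groups (assumed tercile thresholds).
user_group = pd.qcut(train_counts, q=3, labels=["Inactive", "Normal", "Very Active"])

# One row per test trajectory: which user it belongs to and whether the top-1 prediction was correct.
results = pd.DataFrame({
    "user_id": [1, 2, 3, 3, 4, 5, 6, 6],
    "correct": [0, 1, 0, 1, 1, 1, 1, 0],
})
results["group"] = results["user_id"].map(user_group)

print(results.groupby("group", observed=True)["correct"].mean())  # accuracy per user group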

Results

| User Group | Model | NYC | TKY | CA | Moscow | Sao Paulo |
| --- | --- | --- | --- | --- | --- | --- |
| Inactive | GenUP-Llama2-7b | 0.2105 | 0.1306 | 0.1091 | 0.1227 | 0.1366 |
| Normal | GenUP-Llama2-7b | 0.2591 | 0.1394 | 0.1089 | 0.1410 | 0.1504 |
| Very Active | GenUP-Llama2-7b | 0.2752 | 0.2063 | 0.1096 | 0.1748 | 0.1940 |
| Inactive | GenUP-Llama3.1-8b | 0.1826 | 0.1486 | 0.1380 | 0.1180 | 0.1393 |
| Normal | GenUP-Llama3.1-8b | 0.2554 | 0.1695 | 0.1338 | 0.1464 | 0.1598 |
| Very Active | GenUP-Llama3.1-8b | 0.2884 | 0.2688 | 0.1324 | 0.1808 | 0.1944 |
| Inactive | GenUP-Llama3.2-1b | 0.1764 | 0.1306 | 0.1316 | 0.1210 | 0.1429 |
| Normal | GenUP-Llama3.2-1b | 0.2664 | 0.1494 | 0.1223 | 0.1390 | 0.1530 |
| Very Active | GenUP-Llama3.2-1b | 0.2704 | 0.2321 | 0.1263 | 0.1793 | 0.1906 |

Trajectory Length Analysis

To analyze the performance of the models across different trajectory lengths, run the following command:

python src/trajectory_length_analysis.py \
    --model_checkpoint w11wo/Llama-2-7b-longlora-32k-merged-FourSquare-NYC-POI \
    --dataset_id w11wo/FourSquare-NYC-POI

Results

| Trajectory Length | Model | NYC | TKY | CA | Moscow | Sao Paulo |
| --- | --- | --- | --- | --- | --- | --- |
| Short | GenUP-Llama2-7b | 0.1980 | 0.1138 | 0.0649 | 0.0646 | 0.0706 |
| Middle | GenUP-Llama2-7b | 0.2801 | 0.1693 | 0.1154 | 0.1873 | 0.1985 |
| Long | GenUP-Llama2-7b | 0.3099 | 0.2264 | 0.1578 | 0.2494 | 0.2739 |
| Short | GenUP-Llama3.1-8b | 0.2146 | 0.1717 | 0.1070 | 0.0744 | 0.0759 |
| Middle | GenUP-Llama3.1-8b | 0.2529 | 0.2014 | 0.1367 | 0.1899 | 0.2009 |
| Long | GenUP-Llama3.1-8b | 0.3064 | 0.2636 | 0.1637 | 0.2490 | 0.2745 |
| Short | GenUP-Llama3.2-1b | 0.1830 | 0.1423 | 0.1011 | 0.0744 | 0.0762 |
| Middle | GenUP-Llama3.2-1b | 0.2529 | 0.1730 | 0.1385 | 0.1844 | 0.1913 |
| Long | GenUP-Llama3.2-1b | 0.3152 | 0.2384 | 0.1500 | 0.2459 | 0.2694 |

Generalization to Other Datasets

We also evaluate how well our models generalize when trained on one dataset and used to predict the next POI on a different dataset.
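
This reuses the same evaluation script as above; for example, the NYC-fine-tuned Llama-2 checkpoint can be evaluated on the TKY test set by pointing --dataset_id at a different dataset (generation parameters as in the evaluation section, adjusted as needed):

accelerate launch src/eval_next_poi.py \
    --model_checkpoint "w11wo/Llama-2-7b-longlora-32k-merged-FourSquare-NYC-POI" \
    --dataset_id "w11wo/FourSquare-TKY-POI" \
    --apply_liger_kernel_to_llama

The cross-dataset results are shown below.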

GenUP-Llama2-7b

| Trained on \ Evaluated on | NYC | TKY | CA |
| --- | --- | --- | --- |
| NYC | 0.2575 | 0.1438 | 0.0920 |
| TKY | 0.2484 | 0.1699 | 0.0996 |
| CA | 0.2281 | 0.1446 | 0.1094 |

GenUP-Llama3.1-8b

| Trained on \ Evaluated on | NYC | TKY | CA |
| --- | --- | --- | --- |
| NYC | 0.2127 | 0.1179 | 0.0787 |
| TKY | 0.1924 | 0.2582 | 0.0848 |
| CA | 0.1987 | 0.1197 | 0.1339 |

GenUP-Llama3.2-1b

| Trained on \ Evaluated on | NYC | TKY | CA |
| --- | --- | --- | --- |
| NYC | 0.2484 | 0.1197 | 0.0787 |
| TKY | 0.1973 | 0.1851 | 0.0769 |
| CA | 0.2253 | 0.1236 | 0.1267 |

Ablation Study: User Profile Components

Finally, we conduct an ablation study to analyze the impact of the different user profile components on next POI prediction:

| Components | Model | NYC |
| --- | --- | --- |
| Profile | GenUP-Llama2-7b | 0.2568 |
| Profile + Routines & Preferences | GenUP-Llama2-7b | 0.2568 |
| Profile + Routines & Preferences + Attributes | GenUP-Llama2-7b | 0.2575 |
| Profile + Routines & Preferences + Attributes + BFI Traits | GenUP-Llama2-7b | 0.2575 |

Citation

If you find this repository useful for your research, please consider citing our paper:

@misc{wongso2024genupgenerativeuserprofilers,
  title={GenUP: Generative User Profilers as In-Context Learners for Next POI Recommender Systems}, 
  author={Wilson Wongso and Hao Xue and Flora D. Salim},
  year={2024},
  eprint={2410.20643},
  archivePrefix={arXiv},
  primaryClass={cs.IR},
  url={https://arxiv.org/abs/2410.20643}, 
}

Contact

If you have any questions or suggestions, feel free to contact Wilson at w.wongso(at)unsw(dot)edu(dot)au.
