@jmorlock jmorlock commented Jan 25, 2025

For some model parameters, the matrix factorization in BayesianPersonalizedRanking fails.
When it does, some or all entries of the user and item factor matrices become NaN.

This affects both the CPU and the GPU version, but the CPU version already includes a corresponding check. This pull request adds a similar check to the GPU version and consolidates the source code.
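For illustration, a check of this kind could look roughly like the sketch below. It is not the code from this pull request: the helper name check_factors_finite is hypothetical, and it assumes the factor matrices are available as NumPy arrays on the host (GPU matrices would need to be copied back first).

```python
import numpy as np


def check_factors_finite(user_factors, item_factors):
    """Raise ValueError if either factor matrix contains NaN or infinity.

    Hypothetical helper for illustration; assumes host-side NumPy arrays.
    """
    named = (("user_factors", user_factors), ("item_factors", item_factors))
    for name, factors in named:
        # np.isfinite is False for NaN and for +/- infinity
        if not np.isfinite(np.asarray(factors)).all():
            raise ValueError(
                f"{name} contains non-finite values after fitting; "
                "try different model parameters"
            )
```

Calling such a helper at the end of fit would turn a silent failure into an explicit error before recommend is ever used.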

Side note: the missing check can be quite misleading. Without it, no error is raised, yet recommend returns items the user has already liked, even with filter_already_liked_items set to True. This can be verified with the following test:

    import implicit
    import numpy as np
    import scipy.sparse as sparse


    def test_matrix_nan():
        num_users = 2
        num_items = 4
        factors = 3
        # customer 0 liked item 0 and 1
        customers = np.array([0, 0, 1, 1])
        items = np.array([0, 1, 2, 3])
        quantity = np.ones(len(items))
        user_items = sparse.csr_matrix((quantity, (customers, items)))
        user_factors = implicit.gpu._cuda.Matrix(
            np.full((num_users, factors), np.nan, dtype=np.float32)
        )
        item_factors = implicit.gpu._cuda.Matrix(
            np.full((num_items, factors), np.nan, dtype=np.float32)
        )
        # simulate a failed fit by setting both matrices to NaN
        model = implicit.gpu.bpr.BayesianPersonalizedRanking()
        model.user_factors = user_factors
        model.item_factors = item_factors
        (ids, scores) = model.recommend(
            userid=0,
            user_items=user_items[0],
            N=1,
            filter_already_liked_items=True,
            filter_items=None,
            recalculate_user=False,
            items=None,
        )
        assert ids[0] not in {0, 1}  # FAILS