Python Forum
Strange error ValueError: dimension mismatch
#1
I need to test a word2vec model for text-data similarity, but the code below raises this error: ValueError: dimension mismatch

from keras.preprocessing.text import Tokenizer
from keras.preprocessing.sequence import pad_sequences

tokenizer=Tokenizer()
tokenizer.fit_on_texts(documents_df.documents_cleaned)
tokenized_documents=tokenizer.texts_to_sequences(documents_df.documents_cleaned)
tokenized_paded_documents=pad_sequences(tokenized_documents,maxlen=64,padding='post')
vocab_size=len(tokenizer.word_index)+1

# reading Glove word embeddings into a dictionary with "word" as key and values as word vectors
embeddings_index = dict()

with open("D:\Clustering\glove.6B.100d.txt", 'r', encoding="utf8") as file:
    for line in file:
        values = line.split()
        word = values[0]
        coefs = np.asarray(values[1:], dtype='float32')
        embeddings_index[word] = coefs

# creating embedding matrix, every row is a vector representation from the vocabulary indexed by the tokenizer index.
embedding_matrix=np.zeros((vocab_size,100))

for word,i in tokenizer.word_index.items():
    embedding_vector = embeddings_index.get(word)
    if embedding_vector is not None:
        embedding_matrix[i] = embedding_vector

# calculating average of word vectors of a document weighted by tf-idf
document_embeddings=np.zeros((len(tokenized_paded_documents),100))
words=tfidfvectoriser.get_feature_names()

# instead of creating document-word embeddings, directly creating document embeddings
for i in range(documents_df.shape[0]):
    for j in range(len(words)):
        document_embeddings[i]+=embedding_matrix[tokenizer.word_index[words[j]]]*tfidf_vectors[i][j]

pairwise_similarities=cosine_similarity(document_embeddings)
pairwise_differences=euclidean_distances(document_embeddings)
Error:
Traceback (most recent call last):
  File "D:/Clustering/text-cluster-master/Cos_Sim_Eucliden_distance.py", line 156, in <module>
    document_embeddings[i] += embedding_matrix[tokenizer.word_index[words[j]]] * tfidf_vectors[i][j]
  File "D:\Python3.8.0\Python\lib\site-packages\scipy\sparse\base.py", line 550, in __rmul__
    return (self.transpose() * tr).transpose()
  File "D:\Python3.8.0\Python\lib\site-packages\scipy\sparse\base.py", line 498, in __mul__
    raise ValueError('dimension mismatch')
ValueError: dimension mismatch
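
A possible reading of the traceback, offered as a sketch rather than a confirmed diagnosis: the frames in scipy\sparse\base.py suggest that tfidf_vectors is a SciPy sparse matrix (the post does not show how it was created; a TfidfVectorizer's fit_transform output is assumed here). In that case tfidf_vectors[i][j] is not a scalar. tfidf_vectors[i] is a 1 x n_words sparse row, and indexing it again with [j] selects a row of that one-row matrix instead of column j, so multiplying it by a 100-element vector fails with a dimension mismatch. Writing tfidf_vectors[i, j] would give the scalar weight. The sketch below goes one step further and replaces the double loop with a single matrix product. The function names align_word_vectors and tfidf_weighted_embeddings and the toy data are hypothetical and are not part of the original script.

```python
import numpy as np


def align_word_vectors(words, word_index, embedding_matrix):
    """Return a (len(words), dim) array whose row j is the vector for words[j]."""
    # Assumption: row 0 of embedding_matrix is all zeros (Keras word indices
    # start at 1), so words missing from word_index fall back to a zero vector.
    rows = [word_index.get(word, 0) for word in words]
    return embedding_matrix[rows]


def tfidf_weighted_embeddings(tfidf_vectors, words, word_index, embedding_matrix):
    """Sum each document's word vectors, weighted by its TF-IDF scores."""
    word_vectors = align_word_vectors(words, word_index, embedding_matrix)
    # (n_docs, n_words) @ (n_words, dim) -> (n_docs, dim).
    # The @ operator accepts a dense array or a SciPy sparse matrix on the left.
    return np.asarray(tfidf_vectors @ word_vectors)


# Toy data standing in for the real tokenizer, GloVe matrix and TF-IDF output.
words = ["cat", "dog", "fish"]                # TF-IDF column order
word_index = {"dog": 1, "cat": 2}             # "fish" is deliberately missing
embedding_matrix = np.array([[0.0, 0.0],      # row 0: padding / unknown word
                             [1.0, 0.0],      # dog
                             [0.0, 1.0]])     # cat
tfidf_vectors = np.array([[0.5, 0.5, 1.0],
                          [0.0, 2.0, 0.0]])

document_embeddings = tfidf_weighted_embeddings(
    tfidf_vectors, words, word_index, embedding_matrix
)
print(document_embeddings)
```

With the real data, the tokenizer's word_index, the result of get_feature_names() and the sparse tfidf_vectors would be passed in unchanged. The result has one row per document and can be handed to cosine_similarity and euclidean_distances as in the original script.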