2020 PatentDocumentClusteringwithDee

Subject Headings: Text Embedding Clustering, Patent Clustering.

Notes

It proposes a method for automatically clustering patent documents using deep learning techniques.
It uses a neural embedding approach called Doc2Vec to convert the text of patent abstracts into embedding vectors.
It then applies a modified deep embedded clustering (DEC) algorithm to cluster the patent embeddings.
It compares performance to traditional clustering methods like k-means on tf-idf and bag-of-words features.
It finds that the proposed Doc2Vec + DEC method achieves higher accuracy than the baselines.
It visualizes the embeddings using t-SNE to show that the DEC optimization process increases within-cluster coupling.
It attributes the improved performance to strengthened similarity and optimized cluster boundaries.
It highlights the efficiency gains of using negative sampling and KL divergence in DEC over methods like t-SNE.
It concludes that the deep learning approach shows promise for patent analysis tasks like clustering.
It suggests future work on incorporating patent metadata, improving speed for full documents, and data visualization applications.
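
The clustering step described in the notes above can be illustrated with a minimal sketch of the standard DEC objective: a Student's t soft assignment of each embedding to the cluster centroids, a sharpened target distribution, and the KL divergence between the two. This follows the commonly published DEC formulation, not the paper's modified variant, whose details are not given here. The toy 2-D vectors stand in for Doc2Vec patent embeddings, and the function names (`soft_assign`, `target_distribution`, `kl_divergence`) are illustrative assumptions.

```python
import math

def soft_assign(points, centroids, alpha=1.0):
    """Student's t soft assignment q_ij of each point to each centroid."""
    q = []
    for z in points:
        weights = []
        for mu in centroids:
            d2 = sum((a - b) ** 2 for a, b in zip(z, mu))  # squared distance
            weights.append((1.0 + d2 / alpha) ** (-(alpha + 1.0) / 2.0))
        total = sum(weights)
        q.append([w / total for w in weights])  # normalize over clusters
    return q

def target_distribution(q):
    """Sharpened targets: p_ij proportional to q_ij^2 / f_j, f_j = sum_i q_ij."""
    k = len(q[0])
    f = [sum(row[j] for row in q) for j in range(k)]  # soft cluster sizes
    p = []
    for row in q:
        w = [row[j] ** 2 / f[j] for j in range(k)]
        total = sum(w)
        p.append([x / total for x in w])
    return p

def kl_divergence(p, q):
    """KL(P || Q) summed over all points and clusters."""
    return sum(
        pij * math.log(pij / qij)
        for p_row, q_row in zip(p, q)
        for pij, qij in zip(p_row, q_row)
        if pij > 0
    )

# Toy stand-ins for document embeddings and initial centroids (assumed data).
embeddings = [[0.0, 0.1], [0.2, 0.0], [2.0, 2.1], [1.9, 2.0]]
centroids = [[0.0, 0.0], [2.0, 2.0]]

q = soft_assign(embeddings, centroids)
p = target_distribution(q)
loss = kl_divergence(p, q)
labels = [row.index(max(row)) for row in q]  # hard cluster label per document
```

In a full DEC training loop, `loss` would be minimized by gradient descent with respect to both the embeddings and the centroids, which pulls each point toward its most likely cluster. The sketch only evaluates the objective once.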


	Author	volume	Date Value	title	type	journal	titleUrl	doi	note	year
2020 PatentDocumentClusteringwithDee	Jaeyoung Kim, Janghyeok Yoon, Eunjeong Park, Sungchul Choi			Patent Document Clustering with Deep Embeddings				10.1007/s11192-020-03396-7		2020