Privacy-Preserving Multi-Keyword Top-K Similarity Search Over Encrypted Data
Abstract
In order to support a variety of big data applications in industries like health care and scientific research,
cloud computing offers individuals and businesses enormous computing power and scalable storage
capacities. As a result, an increasing number of data owners are outsourcing their data to cloud servers for
great convenience in data management and mining. However, data sets such as electronic health records
frequently contain sensitive information, raising privacy concerns if the documents are made
public or shared with semi-trusted third parties in the cloud. This project examines the problem of
multi-keyword top-k search over encrypted big data in the face of privacy breaches, with the aim of providing an
efficient and secure solution. Specifically, we build a special tree-based index structure and a random traversal
algorithm to protect the privacy of query data. The algorithm makes even identical queries follow different visiting
paths on the index, while keeping query accuracy unchanged under this stronger privacy guarantee. To improve
query efficiency, we propose a group multi-keyword top-k search scheme based on partitioning, in which a group of
tree-based indexes is built over all documents. Finally, we combine these techniques into a
secure and efficient scheme for the proposed top-k similarity search. Extensive experimental results
on real-world data sets show that the proposed scheme can significantly outperform state-of-the-art techniques in
terms of privacy protection, scalability, and query processing performance.
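
The two ideas named in the abstract, a group of tree-based indexes and a randomized traversal that leaves the result unchanged, can be illustrated with a minimal plaintext sketch. This is not the authors' construction: no encryption is performed, and every name here (`Node`, `build_tree`, `build_group_index`, `top_k`) is hypothetical. The sketch assumes that documents are keyword-weight dictionaries with integer ids, that query weights are non-negative, that similarity is an inner product, and that the "partition" simply splits the document list into fixed-size chunks. The abstract does not state how the partition is actually defined.

```python
import heapq
import random


class Node:
    """Tree node; `bound` holds the per-keyword maximum weight in its subtree."""

    def __init__(self, bound, doc_id=None, children=()):
        self.bound = bound
        self.doc_id = doc_id      # set only on leaves
        self.children = children


def score(weights, query):
    # Inner-product similarity (an assumption; query weights must be >= 0).
    return sum(weights.get(kw, 0) * q for kw, q in query.items())


def build_tree(docs):
    """Build a binary tree bottom-up over a non-empty list of (doc_id, weights)."""
    level = [Node(dict(w), doc_id=d) for d, w in docs]
    while len(level) > 1:
        parents = []
        for i in range(0, len(level), 2):
            pair = level[i:i + 2]
            bound = {}
            for child in pair:
                for kw, wt in child.bound.items():
                    bound[kw] = max(bound.get(kw, 0), wt)
            parents.append(Node(bound, children=tuple(pair)))
        level = parents
    return level[0]


def build_group_index(docs, group_size):
    # Assumed partition: consecutive chunks of documents, one tree per chunk.
    return [build_tree(docs[i:i + group_size])
            for i in range(0, len(docs), group_size)]


def _search(node, query, k, heap, rng, visited):
    # Prune only when the bound is strictly below the current k-th score,
    # so ties are still explored and the result does not depend on visit order.
    if len(heap) == k and score(node.bound, query) < heap[0][0]:
        return
    if node.doc_id is not None:
        visited.append(node.doc_id)
        entry = (score(node.bound, query), -node.doc_id)
        if len(heap) < k:
            heapq.heappush(heap, entry)
        elif entry > heap[0]:
            heapq.heapreplace(heap, entry)
        return
    children = list(node.children)
    rng.shuffle(children)         # random traversal order
    for child in children:
        _search(child, query, k, heap, rng, visited)


def top_k(trees, query, k, seed=None):
    """Return ([(doc_id, score), ...] best first, leaf visiting path)."""
    if k <= 0:
        return [], []
    rng = random.Random(seed)
    heap, visited = [], []
    order = list(trees)
    rng.shuffle(order)            # trees are also visited in random order
    for tree in order:
        _search(tree, query, k, heap, rng, visited)
    ranked = sorted(heap, reverse=True)
    return [(-neg_id, s) for s, neg_id in ranked], visited
```

Calling `top_k` repeatedly with the same query and different seeds should return the same ranked list while the recorded visiting path varies, which is the property the abstract attributes to the random traversal. Ties are broken in favor of the smaller document id, a choice made here only to keep the output deterministic.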