The main issue while outsourcing the database is the information leakage which results in violation of both confidentiality and privacy of data. Both the papers provide schemes to prevent violation of these properties. These schemes involves encrypting the database which is stored on untrusted server and later querying the user substring over encrypted data efficiently. In addition to this, it also preserves privacy by hiding the substring query of the user from the untrusted server.
The outsourcing of database involves three parties i) Owner that outsources its database, ii) Client that wants to query the database and, iii) Untrusted server were the database is stored and also provides various services that can be used to perform operations on the database.
However, the problem of this kind of outsourcing is that the client trusts the database owner and not the server. The solution to this problem is to encrypt the database and store it in the untrusted server. But another problem arises that how should the client query the encrypted data and also ensure that the server does not have any idea of what the client is searching for.
One of the solution to this problem is searchable symmetric encryption were the queries are executed on the encrypted database. However achieving efficiency in searching using common indexing methods is quite difficult.
The two papers we have chosen take into consideration privacy preservation techniques as well as optimizing performance in databases. Both papers are related in that they both work with Searchable Symmetric Encryption.
The first paper proposes that there are various methods to maintain privacy in databases but they either leak information or cause false positive records in a complex database. The
authors propose a methodology to maintain privacy using Searchable Symmetric Encryption (SSE)
while ensuring optimal performance using tree-like indices.
The second paper tackles the issue of performing searches on sensitive data whilst being able to forward it to an untrusted source. In real world scenario, providing only security is not sufficient. Rather the requirement is to provide security efficiently. The authors highlight an issue with Searchable Symmetric Encryption (SSE) and improve it by transforming secure substring searches into range queries to be answered securely and efficiently.
The paper aims to reduce range search to multi-keyword search between an untrusted server and a data owner. The goal of the paper is to run multi-keyword search on encrypted data sent by the data owner while achieving performance costs close to those running the same search on unencrypted data while overcoming the shortcomings of prior works on the same topic. Converting a range search to multi-keyword search means converting a range to sub-ranges with each sub-range identifying a node in a tree-like index and as a keyword.
Although the idea of tree-like indexes can cause some structural leakage, it provides the benefit of any black box being able to use this technique as it is quite simple. Similarly, when it comes to database updates, each batch of updates is treated as an independent instance of the dataset and periodically, multiple datasets combine into one dataset. The base idea of different techniques of encryption are discussed such as Order Preserving Encryption (OPE), Deterministic Encryption (DET) and Probabilistic Encryption along with their drawbacks such as frequency and structural leakage. The work closest to this paper is one that introduces the idea of a binary tree over the query attribute domain. For every record in the dataset, the log m dyadic range is calculated over the attribute value.
Table 1: Summary of our RSSE schemes and analytical comparison to our closest competitor. Here the schemes are named after the storage expansion factor and the range covering technique. A higher value in the Security column means better security guarantees.
According to Li et al., a range query is answered by splitting the specified range into its O(log R) minimal dyadic ranges. R here is the range size over the domain and traversing the tree and checking if the Bloom filter in a node contains one of the values from the minimum dyadic range. The table shows the costs incurred by using each of the RSSE schemes. The scheme sets a maximum allowable ratio of false positives for itself which causes there to be a storage cost of O(n log n log m) and false positives to O(R). Here n is the dataset size and r is the result size.
It is difficult to find out the actual search time but it is assumed to be O(r log n). While the performance of this Li et al.’s scheme is admirable, its security stance is not optimal. This paper presents Logarithmic-BRC and Logarithmic-URC that take the said paper’s approach one step further. One important goal here is to achieve forward privacy, that is, the server should not find out if a new item has been added to the database that was not previously there.
Two range covering techniques are discussed, namely BRC and URC. BRC stands for best range cover and URC stands for uniform range cover. Best range cover selects the minimum number of nodes that cover a given range completely. This amounts to O(log R) nodes because a binary tree can take log R time or nodes to traverse based on the search algorithm used. Based on which range of nodes is being considered in the binary tree, the algorithm can reveal more or less data as shown in Figure 1.
Essentially, URC covers a much larger range as it starts with a node in a range and keeps breaking it down until the values are contiguous and thus, it covers a given range completely. After this, a PRF and DPRF are explained, the difference between the two being that in a DPRF, the party holding a key can allow another party to derive DPRF values in a given domain A.
One of the components in the implementation of Range Searchable Symmetric Encryption is by taking an input key k and a range query instead of the keyword w but divided into keywords that output tokens. The authors define their process of generating Range Searchable Symmetric Encrypted documents (or tuples). They provide these 4 steps as constituents of this process:
· Index creation: Break the domain A into contiguous ranges and every range is assigned a unique keyword. Every tuple (or document) is divided into its respective keywords.
· Query: Break the query range into subranges and generate tokens for searching the SSE index.
· Security: Define the functions for data leakage caused by the keyword mapping and index structure
· Updates: Run updates in the form of batches and every so often combine them into one index. At this point, the owner should create a combined index as well. 
Throughout this process, 4 algorithms are used that are part of the SSE security game: Setup, BuildIndex, Trpdr and Search. The Quadratic scheme starts by creating m2 range queries that can be applied to the dataset. Here, “m” is the total number of attributes in the query attribute domain. All possible subranges of A are enumerated and each is assigned a keyword. Each tuple is assigned its respective keywords and replicated tuples are added to a different dataset. After this, the owner runs an algorithm to take an input parameter ? and outputs a key k. The BuildIndex function is then run which takes the secret key k from the previous step and the data collection D and outputs an index I. This is then sent to the server with the encrypted documents. Following this, query tokens are generated when the owner issues a query to run on the database. This function takes in the key k and keyword w. The searching function is then run on the database and takes in the tokens t and index I. The output is a set of documents X.
The Constant-BRC scheme assigns a single keyword to a full tuple that relates to its value in A. The database is then indexed with a Searchable Symmetric Encryption scheme that outputs an index I that has a size of O(n) meaning linear with respect to the number of tuples. A query of size R is mapped into R keywords which is to say that since each keyword corresponds to one tuple, this makes searching easier. Since each keyword in R will bring back an exact/complete tuple that relates to it, there will be no false positives and the query will run in O(r + R) time.
The Constant-BRC and Constant-URC schemes are not secure against adaptive adversaries, that is to say, those that can issue intersecting range queries. Intersecting range queries are comparisons between two values such that a value can lie between two ranges. In order to issue intersecting range queries, the query simulator must have a priori knowledge of the index, that is to say that since these kinds of queries are essential to the functioning of a database, a priori knowledge of the database will be known by the simulator. In order to overcome this shortcoming of the Constant schemes, the authors propose Logarithmic-BRC and Logarithmic-URC schemes.
In this category of RSSE schemes, each tuple is assigned a logarithmic number of keywords. In the BuildIndex function of this algorithm, each replica is permuted with the keyword associated with it. The same for the Trpdr function: The nodes that cover the keyword are found using BRC or URC and the token is permuted with the nodes associated and returned.
Logarithmic-BRC and URC result in more storage cost but provide better searching capability. Another benefit to logarithmic schemes is that they provide better data concealment because there is a certain factor of randomness associated with it compared to the constant number of keywords. One thing that neither type of schemes hides is the partitions between the tuples.
Logarithmic-SRC is another variant that prevents leakage by covering the query range with a single range. Meanwhile, a variant is defined that creates a double index to reduce false positives. This increases communication by one round.
An experiment undertaken on this kind of searchable encryption shows that the number of keys, cost of storage, size of result and search time increase linearly with the number of tokens as each tuple is assigned a separate keyword too.
The construction time for logarithmic indexes and dataset size is greater than that for constant schemes.
The figure shows decrease in false positives by using the interactive scheme.
A comparison of the search times observed between the algorithms using the Gowalla and USPS datasets. Logarithmic-BRC and URC demonstrate the greatest search performance although Constant schemes demonstrate optimal storage. To conclude the summary, a few Searchable Symmetric Encryption schemes are presented that offer a variety of benefits and trade-offs such as some being more secure by hiding data while incurring greater
👋 Hi! I’m your smart assistant Amy!
Don’t know where to start? Type your requirements and I’ll connect you to an academic expert within 3 minutes.get help with your assignment