BIG DATA

  • Attribute-Based Storage Supporting Secure Deduplication of Encrypted Data in Cloud

    • B.Tech / M.Tech / M.Sc / Ph.D

    • GANGADHAR.T

    • 11 Months ago

    Attribute-based encryption (ABE) has been widely used in cloud computing where a data provider outsources his/her encrypted data to a cloud service provider, and can share the data with users possessing specific credentials (or attributes). However, the standard ABE system does not support secure deduplication, which is crucial for eliminating duplicate copies of identical data in order to save storage space and network bandwidth.

  • HDM: A Composable Framework for Big Data Processing

    • B.Tech / M.Tech / M.Sc / Ph.D

    • GANGADHAR.T

    • 12 Months ago

    Over the past years, frameworks such as MapReduce and Spark have been introduced to ease the task of developing big data programs and applications.

  • Heterogeneous Data Storage Management with Deduplication in Cloud Computing

    • B.Tech / M.Tech / M.Sc / Ph.D

    • GANGADHAR.T

    • 12 Months ago

    Cloud storage, one of the most important services of cloud computing, helps cloud users break the bottleneck of restricted resources and expand their storage without upgrading their devices. To guarantee the security and privacy of cloud users, data are always outsourced in encrypted form.

  • Secure k-NN Query on Encrypted Cloud Data with Multiple Keys

    • B.Tech / M.Tech / M.Sc / Ph.D

    • GANGADHAR.T

    • 13 Months ago

    The k-nearest neighbors (k-NN) query is a fundamental primitive in spatial and multimedia databases. It has extensive applications in location-based services, classification, clustering, and so on.

  • NPP: A New Privacy-Aware Public Auditing Scheme for Cloud Data Sharing with Group Users

    • B.Tech / M.Tech / M.Sc / Ph.D

    • GANGADHAR.T

    • 12 Months ago

    Today, cloud storage has become one of the critical services, because users can easily modify and share data with others in the cloud. However, the integrity of shared cloud data is vulnerable to inevitable hardware faults, software failures, or human errors.

  • Online Data Deduplication for In-Memory Big-Data Analytic Systems

    • B.Tech / M.Tech / M.Sc / Ph.D

    • GANGADHAR.T

    • 12 Months ago

    Given a set of files that show a certain degree of similarity, we consider a novel problem of performing data redundancy elimination across a set of distributed worker nodes in a shared-nothing in-memory big data analytic system. The redundancy elimination scheme is designed in a manner that is: (i) space-efficient: the total space needed to store the files is minimized, and (ii) access-isolation

  • Big Data Set Privacy Preserving through Sensitive Attribute-based Grouping

    • B.Tech / M.Tech / M.Sc / Ph.D

    • GANGADHAR.T

    • 13 Months ago

    Attacks on database privacy are a growing trend because of the great value of the private information stored in big data sets. The public's privacy is under threat as adversaries continuously crack popular targets such as bank accounts. We find that existing models such as k-anonymity group records based on quasi-identifiers, which considerably harms data utility. Motivated by this, we propose a sensitive attribute-based privacy model.

  • Designing Self-Tuning Split-Map-Merge Applications for High Cost-Efficiency in the Cloud

    • B.Tech / M.Tech / M.Sc / Ph.D

    • GANGADHAR.T

    • 13 Months ago

    Cloud platforms are attractive for executing large concurrent applications that require access to a pool of resources for concurrently executing the partitions of their workloads. Historically, application designers have tuned concurrent applications for specific hardware and platforms. But such approaches are not viable in cloud platforms as applications can be deployed on a variety of platforms and the operating environments can vary in each deployment. In this work, we argue and demonstrate that concurrent applications in cloud platforms must be self-tuning. First, we show that applications must incorporate a model of the overheads of operation.

  • HDM: A Composable Framework for Big Data Processing

    • B.Tech / M.Tech / M.Sc / Ph.D

    • GANGADHAR.T

    • 12 Months ago

    Over the past years, frameworks such as MapReduce and Spark have been introduced to ease the task of developing big data programs and applications. However, the jobs in these frameworks are roughly defined and packaged as executable jars without any functionality being exposed or described. This means that deployed jobs are not natively composable and reusable for subsequent development. It also hampers the ability to apply optimizations to the data flow of job sequences and pipelines. In this paper, we present the Hierarchically Distributed Data Matrix

  • Attribute-Based Storage Supporting Secure Deduplication of Encrypted Data in Cloud

    • B.Tech / M.Tech / M.Sc / Ph.D

    • GANGADHAR.T

    • 13 Months ago

    Attribute-based encryption (ABE) has been widely used in cloud computing where a data provider outsources his/her encrypted data to a cloud service provider, and can share the data with users possessing specific credentials (or attributes). However, the standard ABE system does not support secure deduplication, which is crucial for eliminating duplicate copies of identical data in order to save storage space and network bandwidth.

  • OverFlow: Multi-Site Aware Big Data Management for Scientific Workflows on Clouds

    • B.Tech / M.Tech / M.Sc / Ph.D

    • GANGADHAR.T

    • 13 Months ago

    The global deployment of cloud datacenters is enabling large-scale scientific workflows to improve performance and deliver fast responses. This unprecedented geographical distribution of the computation is doubled by an increase in the scale of the data handled by such applications, bringing new challenges related to efficient data management across sites. High throughput, low latencies, or cost-related trade-offs are just a few concerns for both cloud providers and users when it comes to handling data across datacenters.

  • H2Hadoop: Improving Hadoop Performance using the Metadata of Related Jobs

    • B.Tech / M.Tech / M.Sc / Ph.D

    • GANGADHAR.T

    • 13 Months ago

    Cloud computing leverages the Hadoop framework to process big data in parallel. Hadoop has certain limitations that could be exploited to execute the job efficiently. These limitations are mostly because of data locality in the cluster, job and task scheduling, and resource allocation in Hadoop. Efficient resource allocation remains a challenge in cloud computing MapReduce platforms. We propose H2Hadoop,

  • Dynamic Job Ordering and Slot Configurations for MapReduce Workloads

    • B.Tech / M.Tech / M.Sc / Ph.D

    • GANGADHAR.T

    • 13 Months ago

    MapReduce is a popular parallel computing paradigm for large-scale data processing in clusters and data centers. A MapReduce workload generally contains a set of jobs, each of which consists of multiple map tasks followed by multiple reduce tasks. Due to 1) that map tasks can only run in map slots and reduce

  • Operational-Log Analysis for Big Data Systems

    • B.Tech / M.Tech / M.Sc / Ph.D

    • GANGADHAR.T

    • 13 Months ago

    Big data systems (BDSs) are complex, consisting of multiple interacting hardware and software components, such as distributed computing nodes, databases, and middleware. Any of these components can fail. Finding the failures' root causes is extremely laborious. Analysis of BDS-generated logs can speed up this process.

  • Hybrid Job-Driven Scheduling for Virtual MapReduce Clusters

    • B.Tech / M.Tech / M.Sc / Ph.D

    • GANGADHAR.T

    • 13 Months ago

    It is cost-efficient for a tenant with a limited budget to establish a virtual MapReduce cluster by renting multiple virtual private servers (VPSs) from a VPS provider. To provide an appropriate scheduling scheme for this type of computing environment, we propose in this paper a hybrid job-driven scheduling scheme (JoSS for short) from a tenant’s perspective. JoSS provides not only job-level scheduling, but also map-task level

  • A Parallel Patient Treatment Time Prediction Algorithm and Its Applications in Hospital Queuing-Recommendation in a Big Data Environment

    • B.Tech / M.Tech / M.Sc / Ph.D

    • GANGADHAR.T

    • 13 Months ago

    Effective patient queue management to minimize patient wait delays and patient overcrowding is one of the major challenges faced by hospitals. Unnecessary and annoying waits for long periods result in substantial human resource and time wastage and increase the frustration endured by patients. For each patient in the queue, the total treatment time of all the patients before him is the time that he must wait. It would be convenient and preferable if the patients could receive the most efficient treatment plan and know the predicted waiting

  • On Traffic-Aware Partition and Aggregation in MapReduce for Big Data Applications

    • B.Tech / M.Tech / M.Sc / Ph.D

    • GANGADHAR.T

    • 13 Months ago

    The MapReduce programming model simplifies large-scale data processing on commodity clusters by exploiting parallel map tasks and reduce tasks. Although many efforts have been made to improve the performance of MapReduce jobs, they ignore the network traffic generated in the shuffle phase, which plays a critical role in

  • A Time Efficient Approach for Detecting Errors in Big Sensor Data on Cloud

    • B.Tech / M.Tech / M.Sc / Ph.D

    • GANGADHAR.T

    • 13 Months ago

    Big sensor data is prevalent in both industry and scientific research applications, where the data is generated with such high volume and velocity that it is difficult to process using on-hand database management tools or traditional data processing applications. Cloud computing provides a promising platform for addressing this challenge, as it provides a flexible stack of massive computing, storage, and software services in a scalable manner at low cost. Some techniques have been developed in recent years for processing

  • BFC: High-Performance Distributed Big-File Cloud Storage Based On Key-Value Store

    • B.Tech / M.Tech / M.Sc / Ph.D

    • GANGADHAR.T

    • 13 Months ago

    Nowadays, cloud-based storage services are growing rapidly and becoming an emerging trend in the data storage field. Designing an efficient storage engine for cloud-based systems raises many problems, with requirements such as big-file processing, lightweight metadata, low latency, parallel I/O, deduplication, distribution, and high scalability. Key-value stores have played an important role and shown many advantages in solving those problems.

  • Data Mining with Big Data

    • B.Tech / M.Tech / M.Sc / Ph.D

    • GANGADHAR.T

    • 13 Months ago

    Big Data concern large-volume, complex, growing data sets with multiple, autonomous sources. With the fast development of networking, data storage, and the data collection capacity, Big Data are now rapidly expanding in all science and engineering domains, including physical, biological and biomedical sciences. This paper presents a HACE theorem that characterizes the features of the Big Data

  • A Scalable Two-Phase Top-Down Specialization Approach for Data Anonymization Using MapReduce on Cloud

    • B.Tech / M.Tech / M.Sc / Ph.D

    • GANGADHAR.T

    • 13 Months ago

    A large number of cloud services require users to share private data like electronic health records for data analysis or mining, bringing privacy concerns. Anonymizing data sets via generalization to satisfy certain privacy requirements such as k-anonymity is a widely used category of privacy preserving techniques. At present, the scale of data in many cloud applications increases tremendously in accordance with the Big Data trend, thereby making it a challenge for commonly

  • User-Defined Privacy Grid System for Continuous Location-Based Services

    • B.Tech / M.Tech / M.Sc / Ph.D

    • GANGADHAR.T

    • 12 Months ago

    Due to the popularity of mobile devices (e.g., cell phones and PDAs), location-based services have become more and more prevalent in recent years. However, users have to reveal their location information to access location-based services with existing service infrastructures. Adversaries could collect this location information, which in turn invades users' privacy. There are existing solutions for query processing on