
Now showing items 1 - 16 of 135

  • On sensor selection in linked information networks

    Aggarwal, Charu C.   Bar-Noy, Amotz   Shamoun, Simon  

  • Improving Classification Quality in Uncertain Graphs

    Dallachiesa, Michele   Aggarwal, Charu C.   Palpanas, Themis  

  • On the anonymizability of graphs

    Aggarwal, Charu C.   Li, Yao   Yu, Philip S.  

    Many applications such as social networks, recommendation systems, email communication patterns, and other collaborative applications are built on top of graph infrastructures. The data stored on such networks may contain personal information about individuals and may therefore be sensitive from a privacy point of view. Therefore, a natural solution is to remove identifying information from the nodes and perturb the graph structure, so that re-identification becomes difficult. Typical graphs encountered in real applications are sparse. In this paper, we will show that sparse graphs have certain theoretical properties which make them susceptible to re-identification attacks. We design a systematic way to exploit these theoretical properties in order to construct re-identification signatures, which are also known as characteristic vectors. These signatures have the property that they are extremely robust to perturbations, especially for sparse graphs. We use these signatures in order to create an effective attack algorithm. We supplement our theoretical results with experimental tests using a number of algorithms on real data sets. These results confirm that even low levels of anonymization require perturbation levels which are significant enough to result in a massive loss of utility. Our experimental results also show that the true anonymization level of graphs is much lower than is implied by measures such as \(k\)-anonymity. Thus, the results of this paper establish that the problem of graph anonymization has fundamental theoretical barriers which prevent a fully effective solution.
  • Theoretical Foundations and Algorithms for Outlier Ensembles

    Aggarwal, Charu C.   Sathe, Saket  

  • A framework for dynamic link prediction in heterogeneous networks

    Aggarwal, Charu C.   Xie, Yan   Yu, Philip S.  

    Network and linked data have become quite prevalent in recent years because of the ubiquity of the web and social media applications, which are inherently network oriented. Such networks are massive, dynamic, contain a lot of content, and may evolve over time. In this paper, we will study the problem of efficient dynamic link inference in temporal and heterogeneous information networks. The problem of efficiently performing dynamic link inference is extremely challenging in massive and heterogeneous information networks because of the challenges associated with the dynamic nature of the network, and the different types of nodes and attributes in it. Both the topology and type information need to be used effectively for the link inference process. We propose an effective two-level scheme which makes efficient macro- and micro-decisions for combining structure and content in a dynamic and time-sensitive way. The time-sensitive nature of the links is leveraged in order to perform effective link prediction. We will also study how to apply the method to the problem of community prediction. We illustrate the effectiveness of our technique over a number of real data sets.
    Statistical Analysis and Data Mining, 2013. DOI: 10.1002/sam.11198
  • On the Use of Side Information for Mining Text Data

    Aggarwal, Charu C.   Zhao, Yuchen   Yu, Philip S.  

    In many text mining applications, side-information is available along with the text documents. Such side-information may be of different kinds, such as document provenance information, the links in the document, user-access behavior from web logs, or other non-textual attributes which are embedded into the text document. Such attributes may contain a tremendous amount of information for clustering purposes. However, the relative importance of this side-information may be difficult to estimate, especially when some of the information is noisy. In such cases, it can be risky to incorporate side-information into the mining process, because it can either improve the quality of the representation for the mining process, or can add noise to the process. Therefore, we need a principled way to perform the mining process, so as to maximize the advantages from using this side information. In this paper, we design an algorithm which combines classical partitioning algorithms with probabilistic models in order to create an effective clustering approach. We then show how to extend the approach to the classification problem. We present experimental results on a number of real data sets in order to illustrate the advantages of using such an approach.
  • Towards graphical models for text processing

    Aggarwal, Charu C.   Zhao, Peixiang  

    The rapid proliferation of the World Wide Web has increased the importance and prevalence of text as a medium for dissemination of information. A variety of text mining and management algorithms have been developed in recent years such as clustering, classification, indexing, and similarity search. Almost all these applications use the well-known vector-space model for text representation and analysis. While the vector-space model has proven itself to be an effective and efficient representation for mining purposes, it does not preserve information about the ordering of the words in the representation. In this paper, we will introduce the concept of distance graph representations of text data. Such representations preserve information about the relative ordering and distance between the words in the graphs and provide a much richer representation in terms of sentence structure of the underlying data. Recent advances in graph mining and hardware capabilities of modern computers enable us to process more complex representations of text. We will see that such an approach has clear advantages from a qualitative perspective. This approach enables knowledge discovery from text which is not possible with the use of a pure vector-space representation, because it loses much less information about the ordering of the underlying words. Furthermore, this representation does not require the development of new mining and management techniques. This is because the technique can also be converted into a structural version of the vector-space representation, which allows the use of all existing tools for text. In addition, existing techniques for graph and XML data can be directly leveraged with this new representation. Thus, a much wider spectrum of algorithms is available for processing this representation. We will apply this technique to a variety of mining and management applications and show its advantages and richness in exploring the structure of the underlying text documents.
  • Machine Learning for Text || Text Summarization

    Aggarwal, Charu C.  

  • Machine Learning for Text || Text Segmentation and Event Detection

    Aggarwal, Charu C.  

  • Machine Learning for Text || Opinion Mining and Sentiment Analysis

    Aggarwal, Charu C.  

  • Neural Networks and Deep Learning (A Textbook) || Recurrent Neural Networks

    Aggarwal, Charu C.  

  • Outlier Ensembles ||

    Aggarwal, Charu C.   Sathe, Saket  

  • Frequent Pattern Mining || An Introduction to Frequent Pattern Mining

    Aggarwal, Charu C.   Han, Jiawei  

  • Frequent Pattern Mining || Sequential Pattern Mining

    Aggarwal, Charu C.   Han, Jiawei  

  • Frequent Pattern Mining || Interesting Patterns

    Aggarwal, Charu C.   Han, Jiawei  

  • Frequent Pattern Mining || Negative Association Rules

    Aggarwal, Charu C.   Han, Jiawei  
