Now showing items 1 - 16 of 133

  • Special issue on challenges in knowledge discovery and data mining

    Shusaku Tsumoto  

  • Detection of risk factors using trajectory mining

    Shusaku Tsumoto   Shoji Hirano  

    This paper proposes a method for grouping trajectories as two-dimensional time-series data. Our method employs a two-stage approach. First, it compares two trajectories based on their structural similarity and determines the best correspondence of partial trajectories. Then, it calculates the value-based dissimilarity for all pairs of matched segments and outputs their total sum as the dissimilarity of the two trajectories. We evaluated this method on two data sets. Experimental results on the Australian sign language dataset and the chronic hepatitis dataset demonstrate that our method can capture the structural similarity between trajectories even in the presence of noise and local differences, and can provide better proximity for discriminating objects.
  • Special issue on data mining for decision making and risk management

    Shusaku Tsumoto   Tzung-Pei Hong  

  • Residual Analysis of Statistical Dependence in Multiway Contingency Tables

    Shusaku Tsumoto   Shoji Hirano  

    A Pearson residual is defined as the residual between the actual and expected values of each cell in a contingency table. This paper shows that this residual can be represented as a linear sum of 2 × 2 determinants, which suggests that the geometrical nature of the residuals can be viewed from Grassmannian algebra.
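
For the two-way case, the link between residuals and 2 × 2 determinants can be checked numerically. The sketch below is an illustration under stated assumptions, not the paper's own formulation: it treats the residual as the observed minus the expected count, uses made-up counts, and the function names are hypothetical. It relies on the identity N·x_ij − x_i·x_j = Σ_{k≠i, l≠j} (x_ij·x_kl − x_il·x_kj), where N is the grand total and x_i, x_j are the row and column sums.

```python
from fractions import Fraction


def raw_residuals(table):
    """Observed minus expected count for each cell of a two-way table."""
    n = sum(map(sum, table))
    row = [sum(r) for r in table]
    col = [sum(c) for c in zip(*table)]
    return [[Fraction(table[i][j]) - Fraction(row[i] * col[j], n)
             for j in range(len(col))] for i in range(len(row))]


def residuals_from_minors(table):
    """The same residuals, written as (1/N) times a sum of 2 x 2 determinants."""
    n = sum(map(sum, table))
    rows, cols = len(table), len(table[0])
    out = []
    for i in range(rows):
        out_row = []
        for j in range(cols):
            # Sum the determinants of every 2 x 2 submatrix that contains cell (i, j).
            total = sum(table[i][j] * table[k][l] - table[i][l] * table[k][j]
                        for k in range(rows) if k != i
                        for l in range(cols) if l != j)
            out_row.append(Fraction(total, n))
        out.append(out_row)
    return out


table = [[10, 4, 6], [3, 12, 5], [7, 2, 11]]  # hypothetical counts
assert raw_residuals(table) == residuals_from_minors(table)
```

Exact fractions are used so that the two computations can be compared with `==` rather than with a floating-point tolerance. The multiway case treated in the paper is not covered here.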
  • Contingency matrix theory: Statistical dependence in a contingency table

    Shusaku Tsumoto  

    Chance discovery aims at understanding the meaning of functional dependency from the viewpoint of unexpected relations. One of the most important observations is that such a chance is hidden under a huge number of co-occurrences extracted from given data. On the other hand, conventional data-mining methods depend strongly on frequencies and statistics rather than on interestingness or unexpectedness. This paper discusses some limitations of ideas of statistical dependence, focusing especially on the formal characteristics of Simpson’s paradox from the viewpoint of linear algebra. Theoretical results show that Simpson’s paradox can be observed when a given contingency table, viewed as a matrix, is not regular; in other words, when the rank of the contingency matrix is not full. Thus, data-ordered evidence has some limitations, which should be compensated for by human-oriented reasoning.
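
As a reminder of what Simpson’s paradox looks like in practice, the following minimal sketch uses hypothetical counts chosen only to produce the reversal. It does not reproduce the rank-based analysis described in the abstract; the strata, group labels, and function names are assumptions made for illustration.

```python
from fractions import Fraction

# Hypothetical (successes, trials) per stratum, chosen only to show the reversal.
strata = {
    "stratum 1": {"A": (8, 10), "B": (70, 100)},
    "stratum 2": {"A": (20, 100), "B": (1, 10)},
}


def rate(successes, trials):
    """Success rate as an exact fraction."""
    return Fraction(successes, trials)


def pooled_rate(group):
    """Success rate of a group after pooling all strata together."""
    successes = sum(strata[name][group][0] for name in strata)
    trials = sum(strata[name][group][1] for name in strata)
    return Fraction(successes, trials)


# Group A has the higher rate inside every stratum ...
a_wins_each = all(rate(*cell["A"]) > rate(*cell["B"]) for cell in strata.values())
# ... yet group B has the higher rate once the strata are pooled.
b_wins_pooled = pooled_rate("B") > pooled_rate("A")
print(a_wins_each, b_wins_pooled)
```

The reversal arises because the two groups are distributed very unevenly across the strata, so the pooled rates weight the strata differently for each group.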
  • Detection of Risk Factors as Temporal Data Mining

    Shoji Hirano   Shusaku Tsumoto  

    A hospital information system (HIS) collects data from all departments of a hospital, including laboratory tests, physiological tests, and electronic patient records. Thus, an HIS can be viewed as a large heterogeneous database that stores chronological changes in patients’ status. In this paper, we applied a trajectory mining method to data extracted from an HIS. Experimental results demonstrated that the method could find groups of trajectories that reflect the temporal covariance of laboratory examinations.
  • International Workshop on Risk Informatics (RI2007)

    Takashi Washio   Shusaku Tsumoto  

    As living standards rise, people pay more attention to risks in society in order to keep their lives safe. Under this increasing demand, modern science and engineering now have to provide efficient measures to reduce social risk in various respects. Meanwhile, the introduction of information technology into society is leading to the accumulation of a large amount of data on our activities. This data can be used to manage the risks in society efficiently. The Workshop on Risk Mining 2006 (RM2006) was held in June 2006 in response to this demand and situation, focusing on risk management based on data mining techniques [1,2]. However, the study of risk management has a long history grounded in mathematical statistics, and mathematical statistics is now making remarkable progress in the data analysis field. This year's successor workshop, the International Workshop on Risk Informatics (RI2007), extended its scope to include risk management by data analysis based on both data mining and mathematical statistics.
  • Trajectory Analysis of Laboratory Tests as Medical Complex Data Mining

    Shoji Hirano   Shusaku Tsumoto  

    Finding temporally covariant variables is very important for clinical practice because the measurements of some examinations can be obtained very easily, while others take a long time to measure. Also, unexpected covariant patterns give us new knowledge about the temporal evolution of chronic diseases. This paper focuses on clustering trajectories of temporal sequences of two laboratory examinations. First, we map a set of time series containing different types of laboratory tests into directed trajectories representing temporal change in patients’ status. Then the trajectories for individual patients are compared at multiple scales and grouped into similar cases by using clustering methods. Experimental results on the chronic hepatitis data demonstrated that the method could find groups of trajectories that reflect the temporal covariance of platelet, albumin and choline esterase.
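
The first step described above, mapping two test series into a directed trajectory, can be sketched as follows. This is only an illustration under assumptions: the two series are taken to be sampled at the same visits, the values are made up, and the function names are hypothetical. The multiscale comparison and the clustering step are not attempted here.

```python
def to_trajectory(series_a, series_b):
    """Pair two time-aligned test series into a sequence of 2-D points."""
    if len(series_a) != len(series_b):
        raise ValueError("series must be sampled at the same time points")
    return list(zip(series_a, series_b))


def directions(trajectory):
    """Step vectors between consecutive points, i.e. the direction of change."""
    return [(x2 - x1, y2 - y1)
            for (x1, y1), (x2, y2) in zip(trajectory, trajectory[1:])]


# Hypothetical values for two tests measured at the same four visits.
test_a = [210, 180, 150, 120]
test_b = [4.2, 4.0, 3.7, 3.5]

traj = to_trajectory(test_a, test_b)
steps = directions(traj)  # one step vector per pair of consecutive visits
```

Each step vector records how both tests changed between two visits, which is what makes the trajectory directed rather than a plain set of points.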
  • Attribute Generalization and Fuzziness in Data Mining Contexts

    Shusaku Tsumoto  

    This paper shows problems with the combination of rule induction and attribute-oriented generalization: if the given hierarchy includes inconsistencies, then the application of hierarchical knowledge generates inconsistent rules. We then introduce two approaches to solving this problem, one of which suggests that the combination of rule induction and attribute-oriented generalization can be used to validate a concept hierarchy. Interestingly, fuzzy linguistic variables play an important role in solving these problems.
  • Medical Reasoning and Rough Sets

    Shusaku Tsumoto  

    Pawlak showed that knowledge can be captured by data partition and proposed a rough set method in which comparison between data partitions gives knowledge about classification. Interestingly, these approximations correspond to the focusing mechanism of differential medical diagnosis: upper approximation as selection of candidates and lower approximation as concluding a final diagnosis. This paper focuses on several models of medical reasoning and shows that core ideas of rough set theory can be observed in these diagnostic models.
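
The lower and upper approximations mentioned above follow the standard rough set definitions: the lower approximation is the union of the partition blocks that lie entirely inside the target set, and the upper approximation is the union of the blocks that overlap it. A minimal sketch, with hypothetical patient identifiers and a hypothetical function name:

```python
def approximations(partition, target):
    """Return (lower, upper) rough-set approximations of `target`.

    `partition` is an iterable of disjoint sets (equivalence classes).
    """
    target = set(target)
    lower, upper = set(), set()
    for block in partition:
        block = set(block)
        if block <= target:
            lower |= block  # every member of the block is in the target
        if block & target:
            upper |= block  # at least one member of the block is in the target
    return lower, upper


# Hypothetical patients grouped by identical symptom profiles.
partition = [{"p1", "p2"}, {"p3"}, {"p4", "p5"}, {"p6"}]
has_disease = {"p1", "p2", "p4"}

lower, upper = approximations(partition, has_disease)
```

In this toy data, `upper` plays the role of the candidate set (patients whose profile is compatible with the disease), while `lower` contains only those whose profile settles the diagnosis.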
  • Risk Mining - Overview

    Shusaku Tsumoto   Takashi Washio  

    The International Workshop on Risk Mining (RM2006) was held in conjunction with the 20th Annual Conference of the Japanese Society for Artificial Intelligence (JSAI2006), Tokyo, Japan, June 2006. The workshop aimed at sharing and comparing experiences with risk mining techniques applied to risk detection, risk clarification and risk utilization. In summary, the workshop provided a discussion forum for researchers working on both data mining and risk management, where the attendees discussed various aspects of risk management based on data mining.
  • Statistical Independence of Multi-variables from the Viewpoint of Linear Algebra

    Shusaku Tsumoto   Shoji Hirano  

    This paper focuses on statistical independence of three variables from the viewpoint of linear algebra. While information granules of statistical independence of two variables can be viewed as determinants of 2 × 2 submatrices, those of three variables consist of a linear combination of odds ratios.
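
For two variables, the connection to 2 × 2 determinants can be made concrete: a two-way table with positive margins satisfies the independence condition exactly when every 2 × 2 subdeterminant vanishes. The sketch below checks both conditions on made-up counts; the function names are hypothetical, and the three-variable case discussed in the abstract is not covered.

```python
from itertools import combinations


def minors_2x2(table):
    """All determinants of 2 x 2 submatrices of a two-way table."""
    rows, cols = range(len(table)), range(len(table[0]))
    return [table[i][j] * table[k][l] - table[i][l] * table[k][j]
            for i, k in combinations(rows, 2)
            for j, l in combinations(cols, 2)]


def is_independent(table):
    """True when every cell satisfies N * x_ij == (row sum) * (column sum)."""
    n = sum(map(sum, table))
    row = [sum(r) for r in table]
    col = [sum(c) for c in zip(*table)]
    return all(n * table[i][j] == row[i] * col[j]
               for i in range(len(row)) for j in range(len(col)))


independent = [[2, 4, 6], [3, 6, 9]]  # hypothetical counts, rows proportional
dependent = [[2, 4, 6], [3, 6, 10]]   # same table with one cell perturbed

assert is_independent(independent) and not any(minors_2x2(independent))
assert not is_independent(dependent) and any(minors_2x2(dependent))
```

Integer arithmetic is used throughout so that the checks are exact.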
  • Interpretation of Contingency Matrix Using Marginal Distributions

    Shusaku Tsumoto   Shoji Hirano  

    This paper presents a formal analysis of a contingency table based on its marginal distributions. The main approach is to construct an expected matrix from the two given marginal distributions and to take the difference between the original cell values and the expected values to obtain a residual matrix. The most important characteristics of a residual matrix are the following: (1) Its determinant is equal to 0, which implies that the rank of this matrix is less than the rank of the original matrix. (2) Each element of a residual matrix can be represented as a linear combination of 2 × 2 subdeterminants. These characteristics show that the residual of a contingency matrix is closely related to 2 × 2 subdeterminants, which in turn shows that the χ² test statistic is a function of 2 × 2 subdeterminants and marginal sums, and suggests that the distribution of determinants should have an important meaning for this statistic.
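
The construction described above can be sketched for a small square table. This is an illustration with made-up counts and hypothetical function names, not the paper's code. It builds the expected matrix from the marginals, forms the residual matrix, confirms that its determinant is zero (every row and column of the residual matrix sums to zero, so the matrix is singular), and computes the χ² statistic from the residuals.

```python
from fractions import Fraction


def expected_and_residual(table):
    """Expected counts from the marginals, and observed minus expected."""
    n = sum(map(sum, table))
    row = [sum(r) for r in table]
    col = [sum(c) for c in zip(*table)]
    expected = [[Fraction(r * c, n) for c in col] for r in row]
    residual = [[table[i][j] - expected[i][j] for j in range(len(col))]
                for i in range(len(row))]
    return expected, residual


def det3(m):
    """Determinant of a 3 x 3 matrix by cofactor expansion."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


table = [[10, 4, 6], [3, 12, 5], [7, 2, 11]]  # hypothetical counts
expected, residual = expected_and_residual(table)

# Each row and column of the residual matrix sums to zero, so it is singular.
assert all(sum(r) == 0 for r in residual)
assert all(sum(c) == 0 for c in zip(*residual))
assert det3(residual) == 0

# Chi-square statistic: sum of squared residuals divided by expected counts.
chi_square = sum(residual[i][j] ** 2 / expected[i][j]
                 for i in range(3) for j in range(3))
```

Exact fractions keep the zero-determinant check free of rounding error.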