
Explain why computing the proximity between two attributes is often simpler t...

admin posted on 2022-4-4 19:18:06
  • Explain why computing the proximity between two attributes is often simpler than computing the similarity between two objects.

    Proximity between two objects:
    The proximity between two objects is a function of the proximity between their corresponding attributes. Similarity and dissimilarity matter because many data mining techniques rely on them, such as clustering, nearest-neighbour classification, and anomaly detection.
    We start with high-level definitions and how they relate, then cover proximity between data objects with a single simple attribute before moving on to objects with multiple attributes.
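    To make the idea of "object proximity as a function of attribute proximities" concrete, here is a minimal sketch. The function names, the example fruits, and the choice to combine attributes by a plain average are all assumptions made for illustration; other combination rules (for example weighted averages) are equally valid.

```python
from typing import Sequence


def nominal_similarity(a, b) -> float:
    """Similarity of two nominal values: 1.0 when they match, 0.0 otherwise."""
    return 1.0 if a == b else 0.0


def interval_similarity(a: float, b: float, value_range: float) -> float:
    """1 minus the absolute difference, scaled by the attribute's assumed range."""
    if value_range <= 0:
        raise ValueError("value_range must be positive")
    return 1.0 - min(abs(a - b) / value_range, 1.0)


def object_similarity(per_attribute: Sequence[float]) -> float:
    """Combine attribute-level similarities with a plain average (an assumption)."""
    if not per_attribute:
        raise ValueError("need at least one attribute similarity")
    return sum(per_attribute) / len(per_attribute)


# Two hypothetical fruits described by (color, weight in grams).
apple = ("red", 150.0)
cherry = ("red", 10.0)

sims = [
    nominal_similarity(apple[0], cherry[0]),                      # colors match
    interval_similarity(apple[1], cherry[1], value_range=200.0),  # weights differ
]
overall = object_similarity(sims)
print(sims, overall)
```

    Each attribute-level function compares a single pair of values of one type. The object-level function has to decide how to combine them, and that decision is where the extra difficulty comes from.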
    Transformation Function:
    It is a function used to convert similarity to dissimilarity and vice versa, or to transform a proximity measure to fall into a particular range. For instance:
    s’ = (s - min(s)) / (max(s) - min(s))
    where,
    s’ = new transformed proximity measure value,
    s = current proximity measure value,
    min(s) = minimum of proximity measure values,
    max(s) = maximum of proximity measure values
    This transformation, commonly known as min-max scaling, maps the values into the range [0, 1]. It is only one of many possible transformations.
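    The formula above can be sketched in a few lines of Python. The function names are hypothetical, and the `1 - s` conversion from similarity to dissimilarity is one common convention for values already in [0, 1], not the only one.

```python
from typing import List, Sequence


def min_max_transform(values: Sequence[float]) -> List[float]:
    """Apply s' = (s - min(s)) / (max(s) - min(s)) to every value."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values are identical, so the formula would divide by zero.
        # Returning zeros is an arbitrary choice made for this sketch.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def similarity_to_dissimilarity(s: float) -> float:
    """Convert a similarity in [0, 1] to a dissimilarity (assumed rule: d = 1 - s)."""
    return 1.0 - s


raw = [2.0, 4.0, 10.0]
scaled = min_max_transform(raw)
print(scaled)  # [0.0, 0.25, 1.0]
print([similarity_to_dissimilarity(s) for s in scaled])  # [1.0, 0.75, 0.0]
```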
    Similarity between two objects:
    A similarity measure quantifies how alike two data objects are. In a data mining or machine learning context, it is often derived from a distance whose dimensions represent the features of the objects. A small distance indicates a high degree of similarity, whereas a large distance indicates a low degree of similarity.
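    As a rough illustration of the distance-based view, the sketch below computes the Euclidean distance between two feature vectors and maps it to a similarity in (0, 1]. The mapping `1 / (1 + d)` is an assumption chosen for this example; it is one of several common ways to turn a distance into a similarity.

```python
import math
from typing import Sequence


def euclidean_distance(x: Sequence[float], y: Sequence[float]) -> float:
    """Straight-line distance between two equal-length feature vectors."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same number of features")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def distance_to_similarity(d: float) -> float:
    """Map a distance in [0, inf) to a similarity in (0, 1] (assumed rule)."""
    return 1.0 / (1.0 + d)


a = [1.0, 2.0]
b = [1.0, 2.0]   # identical to a
c = [4.0, 6.0]   # 3 away in one feature, 4 in the other

print(distance_to_similarity(euclidean_distance(a, b)))  # 1.0
print(euclidean_distance(a, c))                          # 5.0
```

    Identical objects have distance 0 and therefore similarity 1, and the similarity shrinks toward 0 as the distance grows.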
    Similarity measures are widely used in text preprocessing, and the same concepts underlie word embedding techniques. They also appear in various deep learning applications, for example comparing images to check data produced by data augmentation.
    For example, two fruits may be similar in color, size, or taste. Take special care when calculating distance across dimensions or features that are unrelated: the values of each feature should be normalized, or one feature could end up dominating the distance calculation.
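    The effect of skipping normalization can be seen in a small sketch. The fruit data and helper names below are made up for illustration, and min-max scaling per feature is only one possible normalization.

```python
import math
from typing import List, Sequence


def euclidean(x: Sequence[float], y: Sequence[float]) -> float:
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def min_max_scale_columns(rows: Sequence[Sequence[float]]) -> List[List[float]]:
    """Rescale every feature (column) to [0, 1] independently."""
    columns = list(zip(*rows))
    lows = [min(col) for col in columns]
    highs = [max(col) for col in columns]
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0
         for v, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]


# Hypothetical fruits: (weight in grams, sweetness on a 1-10 scale).
fruits = [[150.0, 2.0], [160.0, 9.0], [300.0, 2.0]]

raw_01 = euclidean(fruits[0], fruits[1])
raw_02 = euclidean(fruits[0], fruits[2])

scaled = min_max_scale_columns(fruits)
scaled_01 = euclidean(scaled[0], scaled[1])
scaled_02 = euclidean(scaled[0], scaled[2])

print(raw_01, raw_02)        # weight differences dominate the raw distances
print(scaled_01, scaled_02)  # after scaling, both features contribute comparably
```

    On the raw values, fruit 0 looks far closer to fruit 1 than to fruit 2, purely because weight is measured in larger numbers than sweetness. After scaling, the two distances are nearly equal.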
    Similarity is generally measured in the range [0, 1]. In machine learning, a value in this range is called the similarity score.
    Two main considerations of similarity:
    • Similarity = 1 if X = Y (where X, Y are two objects)
    • Similarity = 0 if X and Y are completely dissimilar (for a simple nominal attribute, whenever X ≠ Y)
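    The main reason proximity between two attributes is simpler to compute is that it is less subjective and depends less on the domain and application: it compares a single pair of values of one type, whereas similarity between two objects must combine several attributes, which requires choices about scaling and weighting.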

admin posted on 2022-5-4 11:15:27
Answer:
Explained below.
