Nihil Obstat: May 2010

28.5.10

CFP: International Workshop on Topic Feature Discovery and Opinion Mining (TFDOM'10)

International Workshop on Topic Feature Discovery and Opinion Mining (TFDOM'10)
Joint with the 10th IEEE International Conference on Data Mining (ICDM'10)
December 13, 2010, Sydney, Australia

Dates

Full paper submission deadline: *** July 23, 2010 ***
Notification of acceptance: September 20, 2010
Camera-ready of accepted papers: October 11, 2010
Workshop: December 13, 2010

Textual data in the world can be roughly categorized into two main types: facts and opinions. Much effort has been devoted to fact-based information processing in the past decades, and many useful techniques have been developed for information retrieval and text mining. In recent years, opinion-based information processing has also been receiving increasing attention from researchers. Understanding people's opinions about particular subjects or issues is important for organizational decision making in general. For instance, organizations are keen on retrieving and analyzing customers' opinions about products and services so as to develop more effective business strategies for product design and customer-centric marketing. Nevertheless, identifying opinion sources, extracting prominent topic features, summarizing relevant opinions, and effectively predicting the polarity of an opinion are all very challenging tasks. These open research problems are the primary focus of this Topic Feature Discovery and Opinion Mining Workshop.

Topic feature discovery aims to identify on-topic information sources and extract relevant features for a given topic (e.g., a person, an event, or a government policy). The results of many empirical experiments suggest that the effectiveness of traditional text mining methods may be hindered when they are applied to topic feature discovery from opinionated sources. This might be caused by the different nature of the problems being tackled, and/or by inappropriate effectiveness measures borrowed from classical data mining research. For instance, widely used measures such as support and confidence turn out to be unsuitable for the leveraging stage. By way of illustration, given a specified topic, a highly frequent pattern (normally short in length) is usually general in semantics, while a specific pattern is long in length and low in frequency. The objective of research on topic feature discovery is to design and develop effective and efficient methods to extract a subset of features from textual documents to describe specific topics or opinion holders.
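The trade-off between short, frequent patterns and long, specific ones can be seen in a minimal sketch. It assumes that each document is reduced to a set of terms and that a pattern is also a set of terms; the toy documents and the function names `support` and `confidence` are illustrative, not taken from the call for papers.

```python
from typing import FrozenSet, List


def support(pattern: FrozenSet[str], docs: List[FrozenSet[str]]) -> float:
    """Fraction of documents that contain every term of the pattern."""
    if not docs:
        return 0.0
    return sum(1 for doc in docs if pattern <= doc) / len(docs)


def confidence(antecedent: FrozenSet[str],
               consequent: FrozenSet[str],
               docs: List[FrozenSet[str]]) -> float:
    """Estimate of P(consequent | antecedent) over the documents."""
    base = support(antecedent, docs)
    if base == 0.0:
        return 0.0
    return support(antecedent | consequent, docs) / base


# Hypothetical toy corpus: four short product reviews as term sets.
docs = [
    frozenset({"battery", "life", "short"}),
    frozenset({"battery", "life", "great"}),
    frozenset({"battery", "charger", "broken"}),
    frozenset({"screen", "bright"}),
]

short_pattern = frozenset({"battery"})                 # general, frequent
long_pattern = frozenset({"battery", "life", "short"})  # specific, rare

support_short = support(short_pattern, docs)
support_long = support(long_pattern, docs)
conf = confidence(frozenset({"battery"}), frozenset({"life"}), docs)

print(support_short, support_long, conf)
```

In this toy corpus the one-term pattern has three times the support of the three-term pattern, even though the longer one says far more about the opinion expressed, which is the kind of mismatch the paragraph above points to.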

Opinion mining, also known as sentiment analysis, aims to summarize and classify opinionated expressions. Compared with traditional fact-based text analysis, research on opinion mining tries to address the new problems related to the identification and analysis of opinions about topics or facts. More specifically, opinion mining techniques have been applied to predicting the polarity (or inclination) of an opinionated expression related to a topic (i.e., an opinion holder). They have also been applied to consolidating and summarizing the possibly contradictory opinions from a large number of electronic documents, such as blogs, online news, and consumer comments, that contain opinionated expressions. The fundamental problems in opinion mining research include the retrieval of opinionated expressions, identification of opinion holders or the specific features of the opinion holders, classification of the polarities of sentiments related to opinion holders, fine-grained analysis of feature-based sentiments, detection of opinion spam, and application of opinion analysis to real-world problem solving or decision making. As a matter of fact, there are many opportunities and challenges for extensive research in the field of opinion mining.

Being inter-related, topic feature discovery and opinion mining are highly challenging topics in modern information analysis, from both an empirical and a theoretical perspective. They are also important issues and critical steps for Web personalization applications and recommender systems. The research problems related to these two topics have attracted increasing attention from researchers in the communities of data mining, Web intelligence, text mining, machine learning, natural language processing, and information retrieval. By focusing closely on these two challenging research topics and their related areas, this workshop aims to advance the theories and techniques for text mining in general and opinion mining in particular, and to explore novel methodologies for the discovery and interpretation of useful and interesting knowledge embedded in textual documents.

Topics include, but are not limited to:

Relevant feature discovery
Opinion mining and sentiment analysis
Multilingual opinion summarization
Sentiment and subjectivity classification
Feature-based sentiment analysis
Information filtering and retrieval
Text mining
Text categorizations
Ontology mining and ontology merging
Information extraction
Recommender systems
Web personalization and opinion analysis
Evaluation methodologies for topic feature discovery and opinion mining
Industrial applications of topic feature discovery and sentiment analysis

Contact:

Yuefeng Li (y2.li AT qut.edu.au)
Xiaohui (Daniel) Tao (x.tao AT qut.edu.au)

CFP: Fourth Workshop on Analytics for Noisy Unstructured Text Data (AND)

Fourth Workshop on Analytics for Noisy Unstructured Text Data (AND)
in conjunction with 19th ACM International Conference on Information and Knowledge Management (CIKM)
October 26th, 2010, Toronto, Canada

Noisy unstructured text data is ubiquitous and abundant in real-world situations. Handling noisy text poses new challenges for Information Extraction (IE), Natural Language Processing (NLP), Information Retrieval (IR) and Knowledge Management (KM). Special handling of noise as well as noise-robust IR and KM techniques are essential to overcome these challenges. As in the case of AND 07, 08 and 09, we intend that AND 2010 will provide researchers an opportunity to present their latest results toward addressing these challenges. We seek papers dealing with all aspects of noisy unstructured text data and its processing.

Topics of Interest (not limited to):

Methods for detecting and correcting errors in noisy text
Information Retrieval from noisy text data
Machine learning techniques for information extraction from noisy text
Rule-based approaches for handling noisy text
Social network analysis involving noisy data
Crowd-sourcing methods for dealing with noisy data
Knowledge Management of noisy text data
Automatic classification and clustering of noisy unstructured text data
Noise-invariant document summarization techniques
Text analysis techniques for analysis and mining of on-line communication texts such as transcribed calls, web logs, chat logs, tweets, microblogs, Facebook posts, and email exchanges
Business Intelligence (BI) applications dealing with noisy text data
Document Representation and Content Analysis of noisy text documents
Interplay between linguistic complexity and uncertainty characterizing noisy text data in downstream applications
Formal theory on characterization of noise
Genre recognition based on the type of noise
Characterizing, modeling and accounting for historical language change
Surveys relating to noisy text analytics

Important Dates

Abstract Submission: July 16th, 2010
Paper Submission: July 20th, 2010
Notification of Acceptance: August 8th, 2010
Camera-Ready papers due: August 15th, 2010

17.5.10

I hate the (increasingly intelligent) Skype spam

I hate spam. I especially hate spam on Skype.

This (Nigerian scam) spam seems to be canned text with placeholders for people's names (taking, in my case, my first family name: Gomez):

Dear Gomez,

I have been in search of someone with this last name "Gomez", so when I saw your name I was pushed to contact you and see how best we can assist each other. I am Mr. Kojo William, i am the regional manager of UNITED BANK OF AFRICA GHANA (U.B.A) . I believe it is the wish of God for me to come across you on search now. I am having an important business discussion I wish to share with you which I believe will interest you because, it is in connection with your last name and you are going to benefit from it.

One Late Shafi Gomez,A citizen of your country had a fixed deposit with my bank in 2004 for 36 calendar months, valued at US$8,400,000.00 (Eight Million, Four Hundred Thousand U,S Dollars) the due date for this deposit contract was this 16 of January 2007. Sadly Shafi was among the death victims in the May 26 2006 Earthquake disaster in Jawa, Indonesia that killed over 5,000 people. He was in Indonesia on a business trip and that was how he met his end.

My bank management is yet to know about his death, I knew about it because he was my friend and I am his account officer. Shafi did not mention any Next of Kin/ Heir when the account was opened, and he Shafi was not married and no children. Last week my Bank Management requested that should give instructions on what to do about his funds, if to renew the contract.

I know this will happen and that is why I have been looking for a means to handle the situation, because if my Bank Directors happens to know that Shafi is dead and do not have any Heir, they will take the funds for their personal use, so I don't want such to happen. That was why when I saw your last name I was happy and I am now seeking your co-operation to present you as Next of Kin/ Heir to the account, since you have the same last name with him and my bank head quarters will release the account to you. There is no risk involved; the transaction will be executed under a legitimate arrangement that will protect you from any breach of law.

It is better that we claim the money, than allowing the Bank Directors to take it, they are rich already. I am not a greedy person, so I am suggesting we share the funds equal, 50/50% to both parties, my share will assist me to start my own company which has been my dream. Let me know your mind on this and please do treat this information as TOP SECRET. We shall go over the details once I receive your urgent response strictly through my personal email address: dr........@gmail.com

We can as well discuss this on phone; let me know when you will be available to speak with me on phone. Have a nice day and God bless you.

I will look forward hearing from you with all respect as soon as you receive my vital message..

Regards

Mr. Kojo William

Community Services

Hard work, but needed:

Reviewing submissions for the First Spanish Conference on Information Retrieval.
Reviewing a submission for ACM Transactions on Knowledge Discovery from Data.
Reviewing a submission for the Information Access and Retrieval section of the ATI Journal Novática.
Reviewing submissions for the International Journal of Electronic Commerce Special Issue on Mining Social Media.

14.5.10

Outlier Detection Techniques Tutorial by Kriegel, Kröger and Zimek

One thing that is fascinating about using Machine Learning or Knowledge Discovery in Databases, especially in text problems, is that with enough training data, surprisingly simple algorithms can achieve excellent results. However, when you want to go further and gain those few extra points on the scale that can make your system either usable or completely useless, you often run into outliers. Hawkins defines outliers as follows:

An outlier is an observation which deviates so much from the other observations as to arouse suspicions that it was generated by a different mechanism. (Hawkins, D. 1980. Identification of Outliers. Chapman and Hall)

So I put this topic on my research agenda long ago, and I recently happened to find an interesting tutorial on it by Hans-Peter Kriegel, Peer Kröger, and Arthur Zimek:

Hans-Peter Kriegel, Peer Kröger, Arthur Zimek: Outlier Detection Techniques. Tutorial at 10th SIAM International Conference on Data Mining (SDM 2010), Columbus, Ohio, 2010. [ abstract | slides (pdf) ]

For those interested in attending it live, there is an opportunity at KDD 2010.
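To make Hawkins's definition a bit more concrete, here is a minimal sketch of one very simple statistical approach: flagging values whose modified z-score, based on the median and the median absolute deviation (MAD), exceeds a threshold. This is only an illustration and is not taken from the tutorial; the function name `mad_outliers`, the commonly used threshold of 3.5, and the sample document lengths are assumptions made for the example.

```python
from statistics import median
from typing import List


def mad_outliers(values: List[float], threshold: float = 3.5) -> List[float]:
    """Return the values whose modified z-score exceeds the threshold.

    The modified z-score is 0.6745 * (x - median) / MAD, where MAD is the
    median of the absolute deviations from the median.
    """
    if not values:
        return []
    med = median(values)
    mad = median([abs(v - med) for v in values])
    if mad == 0:
        # At least half of the values equal the median, so this simple
        # score is undefined; the sketch just reports nothing.
        return []
    return [v for v in values if abs(0.6745 * (v - med) / mad) > threshold]


# Hypothetical example: document lengths in tokens, one suspiciously long.
doc_lengths = [102, 98, 110, 95, 105, 99, 101, 2400]
flagged = mad_outliers(doc_lengths)
print(flagged)
```

The median and MAD are used here instead of the mean and standard deviation because a single extreme value can inflate the latter enough to hide itself.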

CFP: Workshop on Language Technology applied to biomedical documents

Workshop on Language Technology applied to biomedical documents
SEPLN 2010 satellite workshop

In the last decade, language technology has received increasing interest as a suitable solution for retrieving and analysing the huge volume of published documents in the biological domain. Recently, the medical domain has also benefited from the application of such technology. The workshop is intended to provide a forum for discussing the latest advances in language technology applied to the biological and medical domains. The workshop aims to provide a broad view of the shortcomings of existing techniques, tools, and resources, as well as of emergent applications for accessing scientific publications and general-interest health documents, with special attention to non-English documents.

Authors are invited to submit original papers addressing topics including, but not limited to, the following:

Text mining from clinical documents
Integration of biomedical resources in specific applications
Work on minority languages and/or languages other than English
Real world applications (IR systems for medical and scientific specialists, medical education, health knowledge organization, ERH...)
Evaluation methodologies
Biomedical corpus development
Information Retrieval in health domain
Classification of clinical and biological documents (for instance, ICD-10)
Biomedical Named Entity Recognition and Concept Identification
Information Extraction from biological and clinical documents
Anonymisation of clinical texts
Information fusion: integrating data from heterogeneous biomedical sources, connecting resources
Methods of creating, reviewing and editing scientific content
Summarization of electronic patient records, medical reports, scientific articles, etc.
Creation of biomedical annotated corpora
Creation and evaluation of linguistic tools for biomedical domain in different languages
Evaluation methodologies in biomedical domain: system-oriented and user-oriented evaluations

Important dates

Paper submission: June 17th
Notification of acceptance for papers: July 2nd
Final Camera Ready paper due: July 15th
Workshop day: September 6th or 7th, 2010

11.5.10

CFPs on Sentiment Analysis and Opinion Mining

Upcoming workshops on these topics:

First Workshop on Computational Approaches to Subjectivity and Sentiment Analysis (WASSA 2010)
Lisbon, Portugal
August 16, 2010
Submission deadline: May 7, 2010

Special session on Sentiment Analysis and Opinion Mining, DyNaK workshop at ECML-PKDD'10
Barcelona, Spain
September 24, 2010
Submission deadline: June 21, 2010

7.5.10

CFP: International Workshop on the Practical Use of Recommender Systems, Algorithms and Technologies (PRSAT 2010)

International Workshop on the Practical Use of Recommender Systems, Algorithms and Technologies (PRSAT 2010)
30 September 2010 | Barcelona, Spain
In conjunction with the 4th ACM Conference on Recommender Systems (RecSys 2010)

Dates

Paper submission: July 1, 2010
Notification of acceptance/rejection: July 26, 2010
Camera-ready copies of accepted papers: August 16, 2010
Workshop: September 30, 2010

Motivation

User modeling, adaptation, and personalization techniques have hit the mainstream. The explosion of social network websites, on-line user-generated content platforms, and the tremendous growth in computational power of mobile devices are generating incredibly large amounts of user data, and an increasing desire of users to "personalize" (their desktop, e-mail, news site, phone). The potential value of personalization has become clear both as a commodity for the benefit or enjoyment of end-users, and as an enabler of new or better services - a strategic opportunity to enhance and expand businesses. An exciting characteristic of recommender systems is that they draw the interest of industry and businesses while posing very interesting research and scientific challenges.

In spite of significant progress in the research community, and industry efforts to bring the benefits of new techniques to end-users, there are still important gaps that make personalization and adaptation difficult for users. Research activities still often focus on narrow problems, such as incremental accuracy improvements of current techniques, sometimes with ideal hypotheses, or tend to overspecialize on a few applicative problems (typically TV or movie recommenders - sometimes simply because of the availability of data). This restrains de facto the range of other applications where personalization technologies might be useful as well.

Thus, we may have reached a good point to take a step back to seek perspective in the research done in recommender systems. This workshop contrives for a new uptake on past experiences and lessons learned. We propose an analytic outlook on new research directions, or ones that still require substantial research, with a special focus on their practical adoption in working applications, and the barriers to be met in this path.

This workshop aims at bridging the gap between academic researchers and industry practitioners in the area of Recommender Systems. We are interested both in research work that faces real industry problems, and in industry cases that create research challenges.

Topics of interest

This workshop is an opportunity to bring together researchers and practitioners to discuss, on one hand, the main lessons drawn from successes but also from failures of recommender systems, and on the other hand, identify and analyze the major research areas in recommendation and personalization technologies that should be addressed in the future for a practical, effective take-up of the needs of vendors, consumers, and technology providers.

Thus, topics of interest include, but are not limited to:

Limits of recommender systems
- main bottlenecks, research dead ends and myths in recommender systems
- missing technology pieces for wider adoption
- social (privacy, culture) issues
Analytical view of personalization experiences
- case studies of recommender system implementations & deployments
- evaluation and user studies of recommender systems
- scalability in large recommender systems
- lessons learnt from your past experience
- obstacles to massive deployment of recommendation solutions in industrial environments
Recommendation in broader systems
- place of recommender systems in complete systems
- killer application area
Next needs in recommender systems
- new business models related to recommendation
- social and cultural impact of recommender systems
- new paradigms to provide recommendations
- new areas for recommendations
- users' expectations about future recommender systems
- beyond one-shot recommendations: recommendations of sequences, goal-oriented recommendations, ...

CFP: Practical Aspects of Knowledge Management: PAKM 2010

Practical Aspects of Knowledge Management: PAKM 2010
November 10-12, 2010, Philadelphia, USA

The PAKM Conference Series offers a communication forum and meeting ground for practitioners and researchers. The conference focus is on developing and deploying advanced business solutions for the management of knowledge in organizations and other communities that can benefit from its methods.

We wish to encourage the submission of contributions that espouse an interdisciplinary approach, which will therefore be favored over one-dimensional papers.

Apart from a clear description of the real-world problems they address, papers must point out the business and/or the scientific benefits of the suggested solutions for a knowledge management task of an organization or community. Furthermore, papers should emphasize the novel aspects of the suggested approach.

Topics

Below is a non-exhaustive list of topics that will be considered:

Building and maintaining knowledge inventories
- knowledge directories
- knowledge modeling
- automatic creation of meta data
- knowledge integration
Collaboration and knowledge sharing
- knowledge sharing communities
- social software
- knowledge sharing and collaboration platforms
- integration of processes across organizational boundaries
Capturing and securing knowledge
- knowledge capturing within business processes
- knowledge acquisition, lessons learned, debriefing
- organizational memories
- knowledge storage and representation
Knowledge utilization
- content-oriented retrieval
- question answering
- integration of knowledge and business processes
- graphical user interfaces for retrieving and visualizing knowledge
- semantic web
Developing new knowledge
- innovation management
- ontology management and development
- communities of practice
Knowledge Measurement and evaluation
- evaluation of knowledge management systems
- measuring the benefits of KM solutions
- measuring the cost of KM solutions, knowledge access
- measuring the benefits of knowledge access
- proportions and relative cost of knowledge cycles
- cost of missing knowledge
Competitive intelligence
How to store and retrieve shared knowledge
Using digital libraries
Collective intelligence and knowledge management
Knowledge fusion
Theoretical models for KM
Experimental designs to test KM tasks and approaches

Dates

Papers Due: 21-May-10
Notification of acceptance: 1-Jul-10
Camera-ready Due: 12-Aug-10