Proceedings of the 8th International
IEEE Conference on Intelligent Transportation Systems
Vienna, Austria, September 13-16, 2005
Driver Activity Monitoring through Supervised and Unsupervised Learning
Harini Veeraraghavan Stefan Atev Nathaniel Bird Paul Schrater Nikolaos Papanikolopoulos†
Department of Computer Science and Engineering
University of Minnesota
{harini,atev,bird,schrater,npapas}@cs.umn.edu
Abstract— This paper presents two different learning methods applied to the task of driver activity monitoring. The goal of the methods is to detect periods of driver activity that are not safe, such as talking on a cellular telephone, eating, or adjusting the dashboard radio system. The system presented here uses a side-mounted camera looking at a driver’s profile and utilizes the silhouette appearance obtained from skin-color segmentation for detecting the activities. The unsupervised method uses agglomerative clustering to succinctly represent driver activities throughout a sequence, while the supervised learning method uses a Bayesian eigen-image classifier to distinguish between activities. The results of the two learning methods applied to driving sequences on three different subjects are presented and extensively discussed.
I. INTRODUCTION
The goal of this project is to develop a camera-based system for monitoring the activities of automobile drivers. As in any system deployed for monitoring driver activities, the primary goal is to distinguish between safe and unsafe driving actions. An application that motivates this work is objective reporting of a driver’s activities over long driving periods, in contrast with subjective reports based on surveys. Another interesting application is in the area of interior vehicle design, where such information helps improve the placement of controls in order to reduce unsafe driving behaviors.
There is no fixed list of actions that qualify as unsafe driving behaviors. In general, an activity or an action that reduces a driver’s alertness or awareness of their surroundings should be classified as unsafe driving behavior. Some examples of unsafe driving behavior include driver fatigue, talking on a cellular telephone, eating, and adjusting the controls of the dashboard stereo while driving.
In this work, we present methods for summarizing and recognizing the activities of a driver, using the appearance of the driver’s pose as the fundamental cue. The positions of the hands, arms, and head vary across different activities, and vary among individual drivers. While there is a substantial body of work on driver activity monitoring through head and eye tracking [1], [9], [10], [13], [17], [20], there is very little work that makes use of the changes in appearance resulting from the motion of the driver inside the automobile.
The skin-tone regions of the input video are used as the features in the classifiers. In the unsupervised method,
† Author to whom all correspondence should be sent.
binary skin-tone masks are agglomerated across an entire action sequence to assign a probability of observing skin tones for each pixel in the image during the action. Action sequences are separated from one another by detecting substantial movements in the image, signified by large differences between the skin-tone masks of sequential frames. In the supervised method, key frames corresponding to safe driving actions and unsafe driving actions are specified by the user. These key frames are used for obtaining the subspace densities corresponding to an individual action. In this work, talking on a cellular telephone is classified as an unsafe action. A Bayesian eigen-image method is used for classifying the activities.
This paper is organized as follows: Section II discusses the related work in this area. The unsupervised clustering method is discussed in Section III, while the supervised, eigen-image classification method is discussed in Section IV. The results, a discussion, and future work are presented in Section V, and Section VI presents the conclusions.
II. RELATED WORK
Most of the work on driver activity monitoring is focused on the detection of driver alertness through monitoring the eyes [9], [10], [17], face, head, or facial expressions [1], [13], [20]. In order to deal with varying illumination, methods such as [21] use infrared imaging in addition to normal cameras. Learning-based methods such as [2], [19] exist for detecting driver alertness and gaze directions. In our work, both learning methods make use of the silhouette of the subjects for detection of activity. Several silhouette-based activity recognition methods exist in the literature, such as the motion history image method by [6], the W4 system by [8], and the Pfinder system by [18]. The supervised learning or Bayesian eigen-image method is based on the face recognition work of [11]. This method seeks a low-dimensional representation of the data for classification. Several dimensionality reduction techniques exist, such as [3], [4], [15], and the manifold learning methods in [5], [12]. An example of an unsupervised method for learning human behaviors is presented in [14], where a maximum likelihood method is used to learn the structure of a triangulated graph of feature point-based human motions. In [7], the general segment of the body region where significant motion takes place is detected, and this information is used as a cue for matching activities.
0-7803-9215-9/05/$20.00 ©2005 IEEE. 895
III. UNSUPERVISED CLUSTERING OF DRIVING
BEHAVIORS
The most basic cue about a driver’s actions is his pose. However, tracking a driver’s articulated motion in an environment with rapidly varying illumination and many potential self-occlusions is prohibitive, both in terms of computational resources (for model-based tracking) and because automatic initialization of an articulated model is non-trivial. Our approach does not depend on an estimation of a driver’s pose, but on the observation that periods of safe driving are periods of little motion of the driver’s body. Of course, a driver also does not move much while talking on a cellular telephone (an unsafe driving behavior), so the need arises to classify periods of minimal motion into safe-driving periods and unsafe-driving periods.
Detecting motion in a moving car’s interior is complicated since the illumination of the interior can change very rapidly. Furthermore, the outdoor environment is visible through the car’s windows, so motion will always be detected in the image regions corresponding to the car’s windows. To address this problem, we only detect motion of skin-like regions, for example a driver’s face and hands. This approach is advantageous since skin color detection can be fairly robust to various illumination conditions. Skin tones are also unlikely to appear in the window regions, so motion in the outside environment is unlikely to be detected. Portions of the car’s interior that are misclassified as skin are static and will contribute nothing to the detected motion, so such regions are not problematic either.
A. Skin Color Detection
We perform the classification of color pixels into skin tones and non-skin tones by working in the normalized RGB space. The normalization is effective against varying illumination conditions, and can also be motivated by the fact that human skin tones have very similar chromatic properties regardless of race [16].
An RGB triplet (r, g, b) with values for each primary color between 0 and 255 is normalized into the triplet (r', g', b') using the relationships:

r' = 255r / (r + g + b),   g' = 255g / (r + g + b),   b' = 255b / (r + g + b).   (1)
We classify a normalized color (r', g', b') as a skin color if it lies within the region of normalized RGB space described by the following rules (found in [16]):
r' > 95,   g' > 45,   b' > 20,
max{r', g', b'} − min{r', g', b'} > 15,   (2)
r' − g' > 15,   r' > b'.
Fig. 1 shows the results of the skin color detection for various subjects and lighting conditions.
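As a concrete sketch, the normalization of Eq. (1) and the rule-based classification of Eq. (2) can be implemented as follows. The function name and vectorized form are ours, and the morphological post-processing mentioned in Fig. 1 is omitted:

```python
import numpy as np

def skin_mask(image):
    """Rule-based skin detector over normalized RGB.

    `image` is an H x W x 3 array with values in 0..255.  Each pixel
    is normalized as in Eq. (1) and classified with the rules of
    Eq. (2).  Morphological post-processing is omitted in this sketch.
    """
    rgb = image.astype(np.float64)
    total = rgb.sum(axis=2, keepdims=True)
    total[total == 0] = 1.0                # guard against black pixels
    norm = 255.0 * rgb / total             # Eq. (1): (r', g', b')
    r, g, b = norm[..., 0], norm[..., 1], norm[..., 2]
    return ((r > 95) & (g > 45) & (b > 20)
            & (norm.max(axis=2) - norm.min(axis=2) > 15)
            & (r - g > 15) & (r > b))      # Eq. (2)
```

Because the rules operate on the normalized triplet, uniformly scaling the brightness of a pixel leaves its classification unchanged, which is the point of the normalization.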
It should be noted that other skin-tone detection methods can be used without affecting the rest of the algorithm. We tried using a non-parametric Bayesian skin probability map as an alternative approach, but its results were of
Fig. 1. Skin color detection (bottom row) on various images (top). Skin color is indicated in black. The results are post-processed by a sequence of morphological erosions and dilations.
unsatisfactory quality, as the number of training images used to create the map was small and the images themselves were obtained under radically different lighting conditions than those during our driver monitoring experiments. However, if a better skin-color detection method is available, it can be substituted for the rule-based one.
B. Detecting Changes in Behavior
Since our goal is to detect and classify relatively motion-free periods, we use inter-frame differencing to decide when a period starts and ends. If the change between two consecutive skin-color masks obtained by the color classification step is significant, the current low-motion period terminates. When the inter-frame difference drops, we start accumulating data about a new low-motion period.
Given the image region R, the change between two consecutive binary skin-color masks I_{t−1} : R → {0, 1} and I_t : R → {0, 1} is described by the total number of pixels whose classification changed:

c(t) = Σ_{p ∈ R} |I_t(p) − I_{t−1}(p)|.
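Computing c(t) amounts to counting the pixels on which two consecutive binary masks disagree; a minimal sketch (the function name is ours):

```python
import numpy as np

def mask_change(prev_mask, curr_mask):
    """c(t): number of pixels whose skin/non-skin label differs
    between two consecutive binary skin-color masks."""
    return int(np.count_nonzero(prev_mask != curr_mask))
```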
Whenever c(t) is large, a transition in driver behavior is detected. A global threshold cannot be used to determine whether the change c(t) is significant or not, since different low-motion actions differ in the typical amount of “natural” motion that occurs throughout the action. Additionally, the amount of noise in the skin classification masks may differ from one run of the algorithm to another. Finally, the significance of a change c(t) depends on how much of a driver’s skin is exposed. For these reasons, we chose to have a relative threshold for c(t)’s significance that depends on the observed variation in c(t) over a period of time.
Assuming that a low-motion period started at time t_1, we consider the change at time t_n significant if c(t_n) is more than 2 standard deviations away from the mean of the changes c(t_{n−w}), c(t_{n−w+1}), ..., c(t_{n−1}), where w is the history window size (set to 900 frames, which corresponds to 30 seconds of past activity). Both the mean and standard deviation are computed incrementally. Since we start recording data for a new action immediately at the onset of the significant change, the deviation in the first few samples (i.e., c(t_1), c(t_2), ...) is larger, which limits the number of spurious short periods identified by the algorithm. This is
advantageous since the sequence of images leading to a low-motion action will contribute to the action model and thus will allow us to distinguish between otherwise similar low-motion periods based on information about the high-motion events that preceded them.
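The relative 2-standard-deviation test can be sketched as follows. For clarity this version keeps an explicit window buffer, whereas the paper computes the mean and standard deviation incrementally; the class and parameter names are ours:

```python
import math
from collections import deque

class ChangeDetector:
    """Flag a behavior transition when c(t) deviates from the mean of
    recent changes by more than `n_sigma` standard deviations.

    The paper uses a 900-frame window (30 s of past activity); this
    sketch stores the window explicitly rather than incrementally.
    """
    def __init__(self, window=900, n_sigma=2.0):
        self.history = deque(maxlen=window)
        self.n_sigma = n_sigma

    def update(self, c):
        """Return True if the change c is significant, then record it."""
        significant = False
        if len(self.history) >= 2:
            n = len(self.history)
            mean = sum(self.history) / n
            var = sum((x - mean) ** 2 for x in self.history) / n
            significant = abs(c - mean) > self.n_sigma * math.sqrt(var)
        self.history.append(c)
        return significant
```

Because the threshold is relative to recently observed variation, the same detector adapts to actions with different amounts of "natural" motion and to differing noise levels in the skin masks.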
C. Action Models
The change in the binary skin-tone masks indicates the need to start recording an action model. Each action model is simply a probability map that describes the expectation of observing skin color at every location in the input images. Given the binary skin masks I_{t_1}, I_{t_2}, ..., I_{t_n} for a low-motion action with duration from time t_1 until time t_n, the probability map P is defined by:

P = (1/n) Σ_{k=1}^{n} I_{t_k}.   (3)
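Accumulating masks into a per-pixel probability map as in Eq. (3) might look like this sketch (the class name is ours):

```python
import numpy as np

class ActionModel:
    """Per-pixel skin probability map for one low-motion period, as in
    Eq. (3): the fraction of frames in which each pixel was skin."""
    def __init__(self, shape):
        self.counts = np.zeros(shape, dtype=np.float64)
        self.n_frames = 0

    def add(self, mask):
        """Accumulate one binary skin mask."""
        self.counts += mask
        self.n_frames += 1

    @property
    def prob_map(self):
        return self.counts / max(self.n_frames, 1)
```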
Fig. 2. Skin probability maps for several action clusters (representing more than 80% of the driver’s activity). Darker regions indicate higher probability of observing skin tones.
Sample probability maps for several actions are shown in Fig. 2. Individual actions that are determined to be similar are merged together into clusters. The goal of the clustering is to produce clusters that correspond to a single type of behavior (safe or unsafe). Such clustering facilitates further analysis of a driver’s activities, as it reduces tremendously the amount of data that needs to be analyzed (thousands of video frames versus tens of activity models). The similarity between an action model P and an action model Q is defined as:

s(P, Q) = Σ_i √(P̂(i) Q̂(i)),   (4)

where P̂ and Q̂ denote P and Q normalized to sum to one. The measure is the Bhattacharyya coefficient for two normalized histograms, and ranges from 0 to 1. A high similarity measure corresponds to similar action models, while a low measure corresponds to dissimilar models. A model is compared to the means of all clusters and merged with the most similar one if the similarity measure exceeds a certain threshold. Since we cluster according to the distance to the mean (rather than the mean distance), each cluster can be represented by a single action model. The model P is merged into a cluster represented by the model Q according to:
Q(i) ← (n P(i) + m Q(i)) / (n + m),   (5)

where n and m are the number of video frames represented in P and Q, respectively.
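A sketch of the Bhattacharyya similarity and the frame-weighted merge of Eq. (5), with illustrative function names:

```python
import numpy as np

def similarity(P, Q):
    """Bhattacharyya coefficient between two action models, each
    normalized to sum to one; the result lies in [0, 1]."""
    p = P / P.sum()
    q = Q / Q.sum()
    return float(np.sum(np.sqrt(p * q)))

def merge(P, n, Q, m):
    """Frame-weighted merge of model P (n frames) into cluster model
    Q (m frames), as in Eq. (5); returns the new model and count."""
    return (n * P + m * Q) / (n + m), n + m
```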
IV. BAYESIAN EIGEN-IMAGE ACTIVITY CLASSIFICATION
For a side-mounted camera, the significant observable motion is the motion of the driver’s hands in the image. However, one important issue with using hand motions is the problem of self-occlusion for extended periods of time and the resulting pose ambiguity. This problem precludes the use of region-based hand-trackers to detect hand motion and position. Instead, a snapshot representative of a particular action is used as the classification feature.
A. Training Method
The goal of training is to find a representation for the images of a given class. Essentially, we want to find a low-dimensional representation for the data. Several methods, such as the Karhunen-Loève transform, principal component analysis, and eigen-images, exist for computing this low-dimensional representation. We use the eigen-image method originally proposed by [15] for face recognition and extended later by [11]. The method’s robustness to illumination variations and self-occlusions can be achieved by using multiple suitable training images.
Fig. 3. Largest eigenvectors for the drive and talk classes: (a) largest eigenvector for the drive activity; (b) largest eigenvector for the talk activity.
For a given set of images corresponding to a given class, the largest eigenvalues and eigenvectors represent the distribution of the data along the most significant component directions. This is the basis of the method. For the given set of images I_{i_1}, ..., I_{i_K} belonging to a class C_i, an eigenvalue decomposition is performed to obtain Σ_i, the eigenvectors for the class C_i. This operation is performed off-line for each class of images. Fig. 3 shows the second largest principal eigen-image for the talk and drive actions.
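The off-line, per-class eigen-decomposition can be sketched as below. This version uses an SVD of the centered data rather than forming the full covariance matrix, and all names are illustrative, not from the paper's implementation:

```python
import numpy as np

def class_eigenvectors(images, k):
    """Off-line training sketch for one class: flatten the training
    images, center them, and keep the k principal eigenvectors.

    Returns (class mean, top-k eigenvalues, top-k eigenvectors as
    rows).  The SVD of the centered data matrix yields the same
    principal directions as an eigen-decomposition of the covariance.
    """
    X = np.stack([im.ravel().astype(np.float64) for im in images])
    mean = X.mean(axis=0)
    _, s, vt = np.linalg.svd(X - mean, full_matrices=False)
    eigvals = (s ** 2) / len(images)   # eigenvalues of the covariance
    return mean, eigvals[:k], vt[:k]
```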
Fig. 4. Input image for training.
A typical image used for training is shown in Fig. 4. As shown in Fig. 4, only the skin portions of the image are chosen for training. The skin regions are detected automatically using the method described earlier in Section III. This removes irrelevant portions of the image from consideration during training. Further, since only the skin portions corresponding to the hands and face are significant for the two classes, any skin segments around the leg regions are masked. This step helps reduce the dimensionality of the data, thereby improving the accuracy of the training with fewer training samples.
Fig. 5. Distribution of the two classes under the three largest principal components (starting from the second highest). The safe driving class is represented by circles and the unsafe driving class is represented by crosses.
Fig. 6 shows the effect of the number of training samples on the accuracy of classification for all samples belonging to the talk action. The number of correctly classified and misclassified images for different training sample sizes is shown in Fig. 6. Based on Fig. 6, we chose a sample size of 40, where the number of correctly classified samples is maximized and the number of incorrectly classified samples is minimized. Fig. 7 shows the results of classification for samples containing the driving action for different sample sizes. Finally, Fig. 5 shows the distribution of the two classes along the three principal components. The largest eigenvalues and eigenvectors capture the largest variation within a class. However, given the small training data size compared to the dimensionality of the data, the errors in skin segmentation are also modeled. Hence, we take the eigenvalues and eigenvectors starting with the second highest principal component for classification.
B. Activity Classification
The activity in each frame is evaluated by computing its similarity with the set of training images for each activity class. A probabilistic measure of similarity is used instead of the usual Euclidean metric. Given a candidate image I_x, its similarity to an image I_{i_j} from class C_i is computed by projecting the difference of the two images, µ = I_x − I_{i_j},
Fig. 6. The results of classification for the training images of the unsafe driving class.
Fig. 7. Classification results for the safe driving class on the training set.
onto the principal eigenvectors of class C_i. This can be represented as

P(µ|C_i) = exp(−(1/2) µᵀ Σ_i⁻¹ µ) / ((2π)^{d/2} |Σ_i|^{1/2}),   (6)
where Σ_i is estimated from E_i, the largest eigenvectors for class C_i, and d is the dimensionality of the data. This operation is repeated over all member images of a class until a maximum score is found. For recognizing the activity, this operation needs to be performed over all the training images in all the classes. This computation can be very expensive as the number of classes and the number of images increases. To reduce the computational burden, an off-line whitening transformation is performed as described in [11]. Each of the images I_{i_1}, ..., I_{i_K} in class C_i is transformed using the eigenvalues and eigenvectors:
ī^i_j = D_i^{−1/2} S_i I^i_j,   (7)
where D_i and S_i are the eigenvalues and eigenvectors computed for the class C_i. Given these pre-computed transformations, the match for a new image I_x is computed as:
P(µ|C_i) = exp(−(1/2) ||ī^i_x − ī^i_j||²) / ((2π)^{d/2} |Σ_i|^{1/2}),   (8)
where ī^i_x is the transformed image of I_x computed from the eigenvectors and eigenvalues of C_i as in Equation (7). The activity in a frame is classified as safe driving or talking based on the relative values of P(µ|Driving) and P(µ|Talking). Activities having almost equal probabilities for both classes are rejected and not classified as belonging to either class. This occurs when the probability of association is in the range from 0.45 to 0.55.
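Putting the pieces together, here is a sketch of the whitening of Eq. (7), the matching score of Eq. (8) with its normalizing constant omitted, and the rejection rule; all function names are ours:

```python
import numpy as np

def whiten(image_vec, eigvals, eigvecs):
    """Eq. (7): project onto the class eigenvectors and scale each
    coordinate by the inverse square root of its eigenvalue, so that
    matching reduces to a Euclidean distance."""
    return (eigvecs @ image_vec) / np.sqrt(eigvals)

def class_score(x_vec, whitened_train, eigvals, eigvecs):
    """Best match of a test image against the pre-whitened training
    images of one class: the exponential term of Eq. (8), with the
    normalizing constant omitted."""
    wx = whiten(x_vec, eigvals, eigvecs)
    d2 = min(float(np.sum((wx - wj) ** 2)) for wj in whitened_train)
    return np.exp(-0.5 * d2)

def decide(p_drive, p_talk, low=0.45, high=0.55):
    """Reject frames whose normalized probability of association with
    the driving class falls in [low, high]; otherwise pick a class."""
    assoc = p_drive / (p_drive + p_talk)
    if low <= assoc <= high:
        return "rejected"
    return "safe driving" if assoc > high else "talking"
```

Whitening the training images once off-line replaces the per-query Mahalanobis computation of Eq. (6) with a plain Euclidean distance in the whitened space, which is the source of the computational savings.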
V. RESULTS AND DISCUSSION
A. Experimental Setup
Test data for the methods comprises three example videos of individuals pretending to drive a stationary automobile. The video camera used to record the videos was placed on a tripod directly outside the passenger-side window, viewing the driver in profile. Each video features a different individual sitting in the car pretending to drive; different ethnicities and genders are represented. The lighting conditions vary throughout the videos, as the car was in an outdoor parking lot. Each of the three videos is about six minutes long (between 10,500 and 11,000 frames), full-color, and at full 720 × 480 resolution.
During the course of each video, the driver goes through periods of driving normally and performing distracting actions. Distracting actions include talking on a cellular telephone, adjusting the controls of the dashboard radio, and drinking from a soda can. These actions were chosen as the unsafe behaviors to test for because they are very common.
B. Activity Clustering
The goal of the clustering method is to produce as few activity clusters as possible, while not merging together safe and unsafe activities. If safe and unsafe activities are merged together, the subsequent classification of clusters into safe and unsafe activities will introduce errors. If too many clusters are created, the method fails in its goal of summarizing a driver’s activities. We tested the method using different settings for the similarity threshold. Table I shows the performance of the clustering using a threshold value of 0.85. The total number of clusters corresponds to the number of distinct activities recognized by the method. Singleton clusters are clusters that contain only one action model; such clusters usually reflect short periods of high motion indicative of transitions between different actions.
Each sequence was manually segmented into safe driving periods and unsafe driving periods. Since the goal of the clustering method is to group activities for further analysis, it must not group together activities from the two different classes. The proportion of incorrectly merged frames is indicated in the last column of Table I.
Subject   Frames   Clusters (Singletons)   Confusion
   1      10231          33 (24)             4.54%
   2      10110          38 (23)            16.8%
   3      10380          16 (8)             11.1%

TABLE I
UNSUPERVISED METHOD RESULTS
The majority of the incorrectly clustered action models represent failures of the skin-color segmentation. For subject 1, the forearm was not segmented properly on several occasions. Subject 2 has no experience driving and was constantly in motion during the whole sequence. The head pose for subject 2 varied significantly during both safe driving and unsafe driving periods, which contributes to the higher confusion. Finally, subject 3’s results suffer from under-segmentation, but improve significantly if the similarity threshold is increased. We suspect that this is due to the fact that subject 3’s skin-color masks had fewer skin pixels compared to the other subjects, primarily due to skin segmentation failures. Our future work on the unsupervised method will concentrate on making the similarity threshold relative rather than absolute and on improving the skin-color segmentation further.
C. Activity Classification
The supervised Bayesian eigen-image method was tested on the same subjects and sequences as the unsupervised method. Training images were free of segmentation noise, and irrelevant parts of the scene were masked out; the test sets were used as-is. Some of the training images had the leg portions masked out to improve the accuracy of the training. Training images were excluded from the test sequences. The number of training images was 20 for the safe driving class and 40 for the unsafe driving class. A total of 963 test images were used.
The system correctly classified 95.84% of the safe driving activity, and 73.91% of the unsafe driving activity frames. 1.6% of samples in the safe driving activity class were misclassified as unsafe activity and 14.35% of samples in the unsafe activity were misclassified as a safe driving activity. 11.74% of samples in unsafe activity were detected in both classes, as were 3.16% of the samples in the drive activity. The main causes of misclassification were:
• Noise in the segmentation of the test frames.
• Ambiguous posture of the subjects in either class.
Fig. 8. Bad segmentation and self-occlusions can affect the accuracy of classification: (a) bad skin-tone segmentation; (b) pose ambiguity due to self-occlusion.
Noise in the segmentation of the skin portions of drivers resulted mostly from extreme saturation of the color image due to very bright illumination and, in some cases, the coloration of the driver’s clothing. An example of a poorly segmented image is shown in Fig. 8, where the subject’s hands were under-segmented, resulting in poor classification. Another source of misclassification was the ambiguous posture of the driver. For example, a driver leaning too close to the window was misclassified. Another case occurred when only one of the hands was visible due to self-occlusion. In this case, the safe driving activity was confused with the unsafe activity where only one hand is in contact with the steering wheel. An example is shown in Fig. 8, where the system detected the driver to be in either the safe or the unsafe driving state.
While the supervised learning method obtains high classification accuracy, its main drawback is that it is unsuitable for real-time driver activity classification. The supervised learning method was trained using two distinct classes and across different subjects to account for the variability in the appearance of the hands and arms across subjects. However, training for only two classes limits the performance of the system when applied to detect activities such as adjusting the dashboard radio controls.
In this work, our main focus was in distinguishing safe versus unsafe driving activities in general. One extension would be to detect different subsets under each class of activities, in particular the unsafe driving class. Instead of using only one camera and the appearance cue, we would like to extend this work to using multiple cues obtained from multiple cameras, such as eye gaze and head motion.
Although the two learning methods are employed separately, one extension would be to use the two methods together. In other words, the supervised learning method can be used to classify the clusters generated by the unsupervised method. This would allow the system to collect information about a driver’s activities in an online fashion, since individual clusters are produced or updated only when a change in behavior is detected.
VI. CONCLUSIONS
We have presented two different methods for monitoring driving activities under challenging imaging conditions. The results obtained validate the advantages of using driver appearance obtained from skin-color segmentation for classification and clustering purposes. Specific advantages of this approach are the increased robustness to illumination variations and the elimination of the need for tracking and pose determination.
REFERENCES
[1] S. Baker, I. Matthews, J. Xiao, R. Gross, T. Kanade, and T. Ishikawa. Real-time non-rigid driver head tracking for driver mental state estimation. In 11th World Congress on Intelligent Transportation Systems, October 2004.
[2] S. Baluja and D. Pomerleau. Non-intrusive gaze tracking using artificial neural networks. Technical Report CMU-CS-94-102, Carnegie Mellon University, 1994.
[3] M. S. Bartlett, J. R. Movellan, and T. J. Sejnowski. Face recognition by independent component analysis. IEEE Transactions on Neural Networks, 13(6):1450–1464, Nov 2002.
[4] P. N. Belhumeur, J. P. Hespanha, and D. J. Kriegman. Eigenfaces vs. Fisherfaces: Recognition using class specific linear projection. IEEE Transactions on Pattern Analysis and Machine Intelligence, 19(7):711–720, July 1997.
[5] M. Belkin and P. Niyogi. Laplacian eigenmaps for dimensionality reduction and data representation. Neural Computation, 15(6):1373– 1396, 2003.
[6] A. Bobick and J. Davis. The representation and recognition of action using temporal templates. IEEE Transactions on Pattern Analysis and Machine Intelligence, 23(3):257–267, 2001.
[7] J. Gao, R. T. Collins, A. G. Hauptmann, and H. D. Wactlar. Articulated motion modeling for activity analysis. In IEEE Workshop on Articulated and Nonrigid Motion, held in conjunction with CVPR 2004, 2004.
[8] I. Haritaoglu, D. Harwood, and L.S. Davis. W4: real-time surveillance of people and their activities. IEEE Transactions on Pattern Analysis and Machine Intelligence, 22(8):809–830, August 2000.
[9] Q. Ji and X. Yang. Real-time eye, gaze, and face pose tracking for monitoring driver vigilance. Real Time Imaging, 8(5):357–377, Oct 2002.
[10] C. Jiangwei, J. Linsheng, G. Lie, G. Keyou, and W. Rongben. Driver’s eye state detecting method design based on eye geometry feature. In Intelligent Vehicles Symposium, pages 357–362, June 2004.
[11] B. Moghaddam, T. Jebara, and A. Pentland. Bayesian face recognition. Pattern Recognition, 22(11):1771–1782, Nov 2000.
[12] S. Roweis and L. Saul. Nonlinear dimensionality reduction by locally linear embedding. Science, 290:2323–2326, Dec 2000.
[13] P. Smith, M. Shah, and N. da Vitoria Lobo. Eye and head tracking based methods-determining driver visual attention with one camera. IEEE Transactions on Intelligent Transportation Systems, 4(4):205– 218, Dec 2003.
[14] Y. Song, L. Goncalves, and P. Perona. Learning probabilistic structure for human motion detection. In Proc. IEEE Conf. Computer Vision and Pattern Recognition, volume 2, pages 771–777, December 2001.
[15] M. Turk and A. Pentland. Eigenfaces for recognition. Journal of Cognitive Neuroscience, 3(1), 1991.
[16] V. Vezhnevets, V. Sazonov, and A. Andreeva. A survey on pixel-based skin color detection techniques. In Proc. Graphicon’03, pages 85–92, September 2003.
[17] E. Wahlstrom, O. Masoud, and N. Papanikolopoulos. Vision-based methods for driver monitoring. In IEEE Intelligent Transportation Systems Conf., volume 2, pages 903–908, Oct 2003.
[18] C. R. Wren, A. Azarbayejani, T. Darrell, and A. Pentland. Pfinder: Real-time tracking of the human body. IEEE Transactions on Pattern Analysis and Machine Intelligence, 19(7):780–785, 1997.
[19] X. Liu, F. Xu, and K. Fujimura. Real-time eye detection and tracking for driver observation under various light conditions. In IEEE Intelligent Vehicles Symposium, June 2002.
[20] Y. Zhu and K. Fujimura. Head pose estimation for driver monitoring. In Intelligent Vehicles Symposium, pages 501–506, June 2004.
[21] Zhiwei Zhu, Kikuo Fujimura, and Qiang Ji. Real-time eye detection and tracking under various light conditions. In ETRA ’02: Proceedings of the Symposium on Eye Tracking Research & Applications, pages 139–144. ACM Press, 2002.
VII. ACKNOWLEDGMENTS
This work was supported by the Minnesota Department of Transportation, the ITS Institute at the University of Minnesota, and the National Science Foundation through grant #IIS-0219863.
Learning to Capitalize with Character-Level Recurrent Neural Networks:
An Empirical Study
Raymond Hendy Susanto and Hai Leong Chieu and Wei Lu
Singapore University of Technology and Design
DSO National Laboratories
raymond susanto,luwei@sutd.edu.sg
chaileon@dso.org.sg
Abstract
In this paper, we investigate case restoration for text without case information. Previous such work operates at the word level. We propose an approach using character-level recurrent neural networks (RNN), which performs competitively compared to language modeling and conditional random fields (CRF) approaches. We further provide quantitative and qualitative analysis on how RNN helps improve truecasing.
1 Introduction
Natural language texts (e.g., automatic speech transcripts or social media data) often come in non-standard forms, and normalization would typically improve the performance of downstream natural language processing (NLP) applications. This paper investigates a particular sub-task in text normalization: case restoration or truecasing. Truecasing refers to the task of restoring case information (uppercase or lowercase) of characters in a text corpus. Case information is important for certain NLP tasks. For example, Chieu and Ng (2002) used unlabeled mixed case text to improve named entity recognition (NER) on uppercase text.
The task often presents ambiguity: consider the word “apple” in the sentences “he bought an apple” and “he works at apple”. While the former refers to a fruit (hence, it should be in lowercase), the latter refers to a company name (hence, it should be capitalized). Moreover, we often need to recover the case information for words that were previously unseen by the system.
In this paper, we propose the use of character-level recurrent neural networks for truecasing. Previous approaches for truecasing are based on word-level approaches which assign to each word one of the following labels: all lowercase, all uppercase, initial capital, and mixed case. For mixed case words, an additional effort has to be made to decipher exactly how the case is mixed (e.g., MacKenzie). In our approach, we propose a generative, character-based recurrent neural network (RNN) model, allowing us to predict exactly how cases are mixed in such words.
Our main contributions are: (i) we show that character-level approaches are viable compared to word-level approaches, (ii) we show that character-level RNN has a competitive performance compared to character-level CRF, and (iii) we provide our quantitative and qualitative analysis on how RNN helps improve truecasing.
2 Related Work
Word-based truecasing The most widely used approach works at the word level. The simplest approach converts each word to its most frequently seen form in the training data. One popular approach uses HMM-based tagging with an N-gram language model, as in (Lita et al., 2003; Nebhi et al., 2015). Others used a discriminative tagger, such as an MEMM (Chelba and Acero, 2006) or a CRF (Wang et al., 2006). Another approach uses statistical machine translation to translate uncased text into cased text. Interestingly, no previous work operated at the character level. Nebhi et al. (2015) investigated truecasing in tweets, where truecased corpora are less available.
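The most-frequent-form baseline mentioned above can be sketched in a few lines. This is a minimal illustration under our own naming, not the implementation used by any of the cited systems:

```python
from collections import Counter, defaultdict

def train_baseline(cased_corpus):
    """Count how often each surface form occurs for every lowercased word."""
    counts = defaultdict(Counter)
    for sentence in cased_corpus:
        for word in sentence.split():
            counts[word.lower()][word] += 1
    # Map each word to its most frequently seen form in the training data.
    return {w: forms.most_common(1)[0][0] for w, forms in counts.items()}

def truecase(lexicon, uncased_sentence):
    """Restore case word by word; unseen words are left as-is (lowercase)."""
    return " ".join(lexicon.get(w, w) for w in uncased_sentence.split())

lexicon = train_baseline(["He works at Apple", "I ate an apple", "an apple a day"])
print(truecase(lexicon, "he bought an apple"))
```

Note how the baseline cannot resolve the fruit/company ambiguity: every occurrence of “apple” receives the same (majority) form regardless of context.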
Recurrent neural networks Recent years have seen a resurgence of interest in RNNs, particularly variants with long short-term memory (Hochreiter and Schmidhuber, 1997) or gated recurrent units (Cho et al., 2014). RNNs have shown impressive performance in various NLP tasks, such as machine translation (Cho et al., 2014; Luong et al., 2015), language modeling (Mikolov et al., 2010; Kim et al., 2016), and constituency parsing (Vinyals et al., 2015). Nonetheless, the mechanism behind the successful applications of RNNs is rarely studied. In this work, we take a closer look at our trained model to interpret its internal mechanism.
3 The Truecasing Systems
In this section, we describe the truecasing systems that we develop for our empirical study.
3.1 Word-Level Approach
A word-level approach truecases one word at a time. The first system is a tagger based on an HMM (Stolcke, 2002) that translates an uncased sequence of words into a corresponding cased sequence. An N-gram language model trained on a cased corpus is used for scoring candidate sequences. For decoding, the Viterbi algorithm (Rabiner, 1989) computes the highest scoring sequence.
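The tagging scheme above can be illustrated with a toy Viterbi decoder over cased variants of each word, scored by a bigram language model. The bigram log-probabilities below are invented for illustration only; the actual system uses N-gram models trained with SRILM:

```python
# Toy bigram log-probabilities over cased tokens (invented for illustration).
BIGRAM = {
    ("<s>", "He"): -0.5, ("<s>", "he"): -2.0,
    ("He", "works"): -0.7, ("he", "works"): -0.9,
    ("works", "at"): -0.3,
    ("at", "Apple"): -0.8, ("at", "apple"): -2.5,
}

def variants(word):
    """Candidate cased forms for an uncased word."""
    return {word, word.capitalize(), word.upper()}

def viterbi_truecase(words, unk=-5.0):
    """Return the highest-scoring cased sequence under the bigram model."""
    # Each state is a cased variant; beams[v] = (score, sequence ending in v).
    beams = {"<s>": (0.0, [])}
    for w in words:
        new_beams = {}
        for v in variants(w):
            new_beams[v] = max(
                (s + BIGRAM.get((prev, v), unk), seq + [v])
                for prev, (s, seq) in beams.items()
            )
        beams = new_beams
    return max(beams.values())[1]

print(viterbi_truecase(["he", "works", "at", "apple"]))
```

Unlike the most-frequent-form baseline, the decoder can pick “Apple” after “at” while keeping “apple” elsewhere, because surface forms are scored in context.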
The second approach is a discriminative classifier based on a linear chain CRF (Lafferty et al., 2001). In this approach, truecasing is treated as a sequence labeling task, labeling each word with one of the following labels: all lowercase, all uppercase, initial capital, and mixed case. For our experiments, we used the truecaser in Stanford’s NLP pipeline (Manning et al., 2014). Their model includes a rich set of features (Finkel et al., 2005), such as surrounding words, character N-grams, and word shape.
Dealing with mixed case Both approaches require a separate treatment for mixed case words. In particular, we need a gazetteer that maps each word to its mixed case form, either manually created or statistically collected from training data. This motivates the character-level approach: instead of treating mixed case words as a special case, we train our model to capitalize a word character by character.
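Collecting such a gazetteer statistically from training data might look as follows. This is a hedged sketch; the function names and the mixed-case test are our own:

```python
from collections import Counter, defaultdict

def is_mixed(word):
    """Mixed case: an uppercase letter somewhere after the first character,
    but not an all-uppercase word (e.g., MacKenzie, not IEEE)."""
    return any(c.isupper() for c in word[1:]) and not word.isupper()

def build_gazetteer(cased_corpus):
    """Statistically collect mixed case forms from training data."""
    counts = defaultdict(Counter)
    for sentence in cased_corpus:
        for word in sentence.split():
            if is_mixed(word):
                counts[word.lower()][word] += 1
    # Keep the most frequent mixed form for each lowercased word.
    return {w: c.most_common(1)[0][0] for w, c in counts.items()}

gaz = build_gazetteer(["Ms MacKenzie met Ms McKenna", "the MacKenzie river"])
```

A word absent from the gazetteer has no recoverable mixed form under this scheme, which is exactly the limitation the character-level model avoids.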
3.2 Character-Level Approach
A character-level approach converts each character to either uppercase or lowercase. In this approach, mixed case forms are naturally taken care of, and moreover, such models generalize better to unseen words. Our third system is a linear chain CRF that makes character-level predictions. Similar to the word-based CRF, it includes surrounding words and character N-grams as features.
Finally, we propose a character-level approach using an RNN language model. RNNs are particularly useful for modeling sequential data. At each time step t, an RNN takes an input vector xt and the previous hidden state ht−1, and produces the next hidden state ht. Different recurrence formulations lead to different RNN models, which we describe below.
Long short-term memory (LSTM) is an architecture proposed by Hochreiter and Schmidhuber (1997). It augments an RNN with a memory cell vector ct in order to address the learning of long-range dependencies. The content of the memory cell is updated additively, mitigating the vanishing gradient problem of vanilla RNNs (Bengio et al., 1994). Read, write, and reset operations to the memory cell are controlled by the input gate i, output gate o, and forget gate f. The hidden state is computed as:
it = σ(Wi ht−1 + Ui xt) (1)
ot = σ(Wo ht−1 + Uo xt) (2)
ft = σ(Wf ht−1 + Uf xt) (3)
gt = tanh(Wg ht−1 + Ug xt) (4)
ct = ft ⊙ ct−1 + it ⊙ gt (5)
ht = ot ⊙ tanh(ct) (6)
where σ and tanh are element-wise sigmoid and hyperbolic tangent functions, ⊙ denotes element-wise multiplication, and Wj and Uj are parameters of the LSTM for j ∈ {i, o, f, g}.
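For concreteness, Eqs. (1)-(6) can be transcribed directly into NumPy. This is a didactic sketch: biases are omitted exactly as in the equations, and the randomly initialized parameters are placeholders, not trained weights:

```python
import numpy as np

def lstm_step(x_t, h_prev, c_prev, W, U):
    """One LSTM step following Eqs. (1)-(6); W[j], U[j] for j in {i, o, f, g}."""
    sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))
    i_t = sigmoid(W["i"] @ h_prev + U["i"] @ x_t)   # input gate, Eq. (1)
    o_t = sigmoid(W["o"] @ h_prev + U["o"] @ x_t)   # output gate, Eq. (2)
    f_t = sigmoid(W["f"] @ h_prev + U["f"] @ x_t)   # forget gate, Eq. (3)
    g_t = np.tanh(W["g"] @ h_prev + U["g"] @ x_t)   # candidate, Eq. (4)
    c_t = f_t * c_prev + i_t * g_t                  # additive cell update, Eq. (5)
    h_t = o_t * np.tanh(c_t)                        # hidden state, Eq. (6)
    return h_t, c_t

rng = np.random.default_rng(0)
d_h, d_x = 4, 3
W = {j: rng.normal(size=(d_h, d_h)) for j in "iofg"}
U = {j: rng.normal(size=(d_h, d_x)) for j in "iofg"}
h, c = np.zeros(d_h), np.zeros(d_h)
h, c = lstm_step(rng.normal(size=d_x), h, c, W, U)
```

The additive form of Eq. (5) is what keeps gradients flowing through ct: the forget gate scales the old cell content instead of repeatedly squashing it through a nonlinearity.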
Gated recurrent unit (GRU) is a gating mechanism in RNNs introduced by Cho et al. (2014). They proposed a hidden state computation with reset and update gates, resulting in a simpler LSTM variant:
rt = σ(Wr ht−1 + Ur xt) (7)
zt = σ(Wz ht−1 + Uz xt) (8)
h̃t = tanh(Wh (rt ⊙ ht−1) + Uh xt) (9)
ht = (1 − zt) ⊙ ht−1 + zt ⊙ h̃t (10)
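Similarly, Eqs. (7)-(10) transcribe into a single GRU step, again as a didactic sketch with placeholder (untrained) parameters:

```python
import numpy as np

def gru_step(x_t, h_prev, W, U):
    """One GRU step following Eqs. (7)-(10); W[j], U[j] for j in {r, z, h}."""
    sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))
    r_t = sigmoid(W["r"] @ h_prev + U["r"] @ x_t)              # reset gate, Eq. (7)
    z_t = sigmoid(W["z"] @ h_prev + U["z"] @ x_t)              # update gate, Eq. (8)
    h_tilde = np.tanh(W["h"] @ (r_t * h_prev) + U["h"] @ x_t)  # candidate, Eq. (9)
    return (1.0 - z_t) * h_prev + z_t * h_tilde                # interpolation, Eq. (10)

rng = np.random.default_rng(1)
d_h, d_x = 4, 3
W = {j: rng.normal(size=(d_h, d_h)) for j in "rzh"}
U = {j: rng.normal(size=(d_h, d_x)) for j in "rzh"}
h = gru_step(rng.normal(size=d_x), np.zeros(d_h), W, U)
```

Note the structural difference from the LSTM: the GRU keeps no separate memory cell, and the update gate zt directly interpolates between the old and candidate hidden states.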
                         EN-Wikipedia                 EN-WSJ                       EN-Reuters                   DE-ECI
                         Acc.   P      R      F1      Acc.   P      R      F1      Acc.   P      R      F1      Acc.   P      R      F1
Word-based Approach
LM (N = 3)               94.94  89.34  84.61  86.91   95.59  91.56  78.79  84.70   94.57  93.49  79.43  85.89   95.67  97.84  87.74  92.51
LM (N = 5)               94.93  89.42  84.41  86.84   95.62  91.72  78.79  84.77   94.66  93.92  79.47  86.09   95.68  97.91  87.70  92.53
CRF-WORD                 96.60  94.96  87.16  90.89   97.64  93.12  90.41  91.75   96.58  93.91  87.19  90.42   96.09  98.41  88.73  93.32
Chelba and Acero (2006)  n/a                          97.10  -      -      -       n/a                          n/a
Character-based Approach
CRF-CHAR                 96.99  94.60  89.27  91.86   97.00  94.17  84.46  89.05   97.06  94.63  89.12  91.80   98.26  96.95  96.59  96.77
LSTM-SMALL               96.95  93.05  90.59  91.80   97.83  93.99  90.92  92.43   97.37  93.08  92.63  92.86   98.70  97.52  97.39  97.46
LSTM-LARGE               97.41  93.72  92.67  93.19   97.72  93.41  90.56  91.96   97.76  94.08  93.50  93.79   99.00  98.04  97.98  98.01
GRU-SMALL                96.46  92.10  89.10  90.58   97.36  92.28  88.60  90.40   97.01  92.85  90.84  91.83   98.51  97.15  96.96  97.06
GRU-LARGE                96.95  92.75  90.93  91.83   97.27  90.86  90.20  90.52   97.12  92.02  92.07  92.05   98.35  96.86  96.79  96.82
Table 2: Truecasing performance in terms of accuracy (Acc.), precision (P), recall (R), and F1. All improvements of the best performing character-based systems over the best performing word-based systems are statistically significant under the sign test (p < 0.01). All improvements of the best performing RNN systems over CRF-CHAR are statistically significant under the sign test (p < 0.01).
Figure 1: Cells that are sensitive to lowercased and capitalized words; samples from DE-ECI. Text color represents activations (−1 < tanh(c) < 1): positive is blue, negative is red. Darker color corresponds to greater magnitude.
4.2 Results
Table 2 shows the experimental results in terms of precision, recall, and F1. Most previous work did not evaluate on the same datasets. We compare our work with Chelba and Acero (2006), using the same WSJ sections and 2M words of training data for training and evaluation. Chelba and Acero only reported error rate, and all our RNN and CRF approaches outperform their results in terms of error rate.
First, the word-based CRF approach gives up to an 8% relative F1 increase over the LM approach. Except on WSJ, moving to the character level further improves the CRF by 1.1-3.7%, most notably on the German dataset. Long compound nouns are common in German, which generates many out-of-vocabulary words; thus, we hypothesize that the character-based approach improves generalization. Finally, the best F1 score on each dataset is achieved by an RNN variant: 93.19% on EN-Wiki, 92.43% on EN-WSJ, 93.79% on EN-Reuters, and 98.01% on DE-ECI.
We highlight that different features are used in CRF-WORD and CRF-CHAR. CRF-CHAR includes only simple features, namely character and word N-grams and sentence boundary indicators. In contrast, CRF-WORD contains a richer feature set, predefined in Stanford’s truecaser. For instance, it includes word shape in addition to neighboring words and character N-grams. It also includes more feature combinations, such as the concatenation of the word shape, current label, and previous label. Nonetheless, CRF-CHAR generally performs better than CRF-WORD. Potentially, CRF-CHAR could be improved further by using larger N-grams. We chose simple features to optimize training speed; consequently, we were able to dedicate more time to tuning the regularization weight.
Training a larger RNN model generally improves performance, but not always, due to possible overfitting. LSTM seems to work better than GRU in this task; the GRU models have about 25% fewer parameters. In terms of training time, it took 12 hours to train the largest RNN model on a single Titan X GPU. For comparison, the longest training time for a single CRF-CHAR model is 16 hours. Training LM and CRF-WORD is much faster, at 30 seconds and 5.5 hours respectively, so there is a speed-accuracy trade-off.
5 Analysis
5.1 Visualizing LSTM Cells
An interesting component of the LSTM is its memory cells, which are designed to store long-range dependency information. Many of these memory cells are not human-interpretable, but after introspecting our trained model, we find a few memory cells that are sensitive to case information. In Figure 1, we plot the memory cell activations at each time step (i.e., tanh(ct)). We can see that these cells activate differently depending on the case information of a word (towards −1 for uppercase and +1 for lowercase).
5.2 Case Category and OOV Performance
Corpus Lower Cap. Upper Mixed OOV
EN-Wiki 79.91 18.67 0.91 0.51 2.40
EN-WSJ 84.28 13.06 2.63 0.03 3.11
EN-Reuters 78.36 19.80 1.53 0.31 5.37
DE-ECI 68.62 29.15 1.02 1.21 4.01
Table 3: Percentage distribution of the case categories and OOV words
In this section, we analyze the system performance on each case category. First, we report the percentage distribution of the case categories in each test set in Table 3. For both languages, the most frequent case category is lowercase, followed by capitalization, which generally applies to the first word
Figure 2: Accuracy on mixed case (a), capitalized (b), uppercase (c), and OOV words (d), for each of EN-Wiki, EN-WSJ, EN-Reuters, and DE-ECI.
in the sentence and to proper nouns. The uppercase form, which is often found in abbreviations, occurs more frequently than mixed case in English, but the other way around in German.
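The four case categories of Table 3 can be computed with a simple classifier like the following. This is a sketch; the exact tokenization and category rules used in the paper may differ:

```python
from collections import Counter

def case_category(word):
    """Assign one of the four case categories used in word-level truecasing."""
    if word.islower() or not any(c.isalpha() for c in word):
        return "lower"
    if word.isupper():
        return "upper"
    if word[0].isupper() and word[1:].islower():
        return "capitalized"
    return "mixed"

def distribution(tokens):
    """Percentage distribution of case categories over a token list."""
    counts = Counter(case_category(w) for w in tokens)
    total = sum(counts.values())
    return {cat: 100.0 * n / total for cat, n in counts.items()}

dist = distribution("The IEEE paper cites MacKenzie and the WSJ corpus".split())
```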
Figure 2 (a) shows system accuracy on mixed case words. We choose the best performing LM and RNN for each dataset. Character-based approaches perform better on mixed case words than word-based approaches, and RNN generally performs better than CRF. In CRF-WORD, surface forms are generated after label prediction. This is more rigid compared to LM, where the surface forms are considered during decoding.
In addition, we report system accuracy on capitalized words (first letter uppercase) and uppercase words in Figure 2 (b) and (c), respectively. RNN performs best on capitalized words. On the other hand, CRF-WORD performs best on uppercase words. We believe this is related to the rare occurrence of uppercase words during training, as shown in Table 3. Although mixed case occurs even more rarely in general, there are important clues, such as character prefixes. CRF-CHAR and RNN have comparable performance on uppercase words; for instance, there are only 2 uppercase words in WSJ that were predicted differently by CRF-CHAR and RNN. All systems perform equally well (99% accuracy) on lowercase. Overall, RNN has the best performance.
Last, we present results on out-of-vocabulary (OOV) words with respect to the training set. The statistics of OOV words are given in Table 3. The system performance across datasets is reported in Figure 2 (d). We observe that RNN consistently performs better than the other systems, which shows that it generalizes better to unseen words.
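The OOV statistics in Table 3 correspond to a computation like the following sketch; whether the vocabulary comparison is case-insensitive is our assumption:

```python
def oov_rate(train_tokens, test_tokens):
    """Percentage of test tokens unseen in the training set (case-insensitive)."""
    vocab = {w.lower() for w in train_tokens}
    oov = sum(1 for w in test_tokens if w.lower() not in vocab)
    return 100.0 * oov / len(test_tokens)

train = "the cat sat on the mat".split()
test = "the dog sat".split()
rate = oov_rate(train, test)  # "dog" is the only unseen token
```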
6 Conclusion
In this work, we conduct an empirical investigation of truecasing approaches. We have shown that character-level approaches work well for truecasing, and that RNN performs competitively compared to language modeling and CRF. Future work includes applications to informal texts, such as tweets and short messages (Muis and Lu, 2016).
Acknowledgments
We would like to thank the anonymous reviewers for their helpful comments. This work is supported by MOE Tier 1 grant SUTDT12015008.
References
Yoshua Bengio, Patrice Simard, and Paolo Frasconi. 1994. Learning long-term dependencies with gradient descent is difficult. IEEE Transactions on Neural Networks, 5(2):157–166.
Ciprian Chelba and Alex Acero. 2006. Adaptation of maximum entropy capitalizer: Little data can help a lot. Computer Speech & Language, 20(4):382–399.
Hai Leong Chieu and Hwee Tou Ng. 2002. Teaching a weaker classifier: Named entity recognition on upper case text. In Proceedings of ACL, pages 481–488.
Kyunghyun Cho, Bart van Merrienboer, Caglar Gulcehre, Dzmitry Bahdanau, Fethi Bougares, Holger Schwenk, and Yoshua Bengio. 2014. Learning phrase representations using RNN encoder–decoder for statistical machine translation. In Proceedings of EMNLP, pages 1724–1734.
William Coster and David Kauchak. 2011. Simple English Wikipedia: A new text simplification task. In Proceedings of ACL-HLT, pages 665–669.
Jenny Rose Finkel, Trond Grenager, and Christopher Manning. 2005. Incorporating non-local information into information extraction systems by Gibbs sampling. In Proceedings of ACL, pages 363–370.
Sepp Hochreiter and Jürgen Schmidhuber. 1997. Long short-term memory. Neural Computation, 9(8):1735–1780.
Andrej Karpathy, Justin Johnson, and Fei-Fei Li. 2016. Visualizing and understanding recurrent networks. In Proceedings of ICLR.
Yoon Kim, Yacine Jernite, David Sontag, and Alexander M. Rush. 2016. Character-aware neural language models. In Proceedings of AAAI.
John Lafferty, Andrew McCallum, and Fernando C. N. Pereira. 2001. Conditional random fields: Probabilistic models for segmenting and labeling sequence data. In Proceedings of ICML, pages 282–289.
Lucian Vlad Lita, Abe Ittycheriah, Salim Roukos, and Nanda Kambhatla. 2003. tRuEcasIng. In Proceedings of ACL, pages 152–159.
Minh-Thang Luong, Hieu Pham, and Christopher D. Manning. 2015. Effective approaches to attention-based neural machine translation. In Proceedings of EMNLP, pages 1412–1421.
Christopher D. Manning, Mihai Surdeanu, John Bauer, Jenny Rose Finkel, Steven Bethard, and David McClosky. 2014. The Stanford CoreNLP natural language processing toolkit. In Proceedings of ACL System Demonstrations, pages 55–60.
Tomas Mikolov, Martin Karafiát, Lukas Burget, Jan Černocký, and Sanjeev Khudanpur. 2010. Recurrent neural network based language model. In Proceedings of INTERSPEECH, pages 1045–1048.
Aldrian Obaja Muis and Wei Lu. 2016. Weak semi-Markov CRFs for noun phrase chunking in informal text. In Proceedings of NAACL.
Kamel Nebhi, Kalina Bontcheva, and Genevieve Gorrell. 2015. Restoring capitalization in #tweets. In Proceedings of WWW Companion, pages 1111–1115.
Naoaki Okazaki. 2007. CRFsuite: A fast implementation of conditional random fields (CRFs).
Douglas B. Paul and Janet M. Baker. 1992. The design for the Wall Street Journal-based CSR corpus. In Proceedings of the Workshop on Speech and Natural Language, pages 357–362.
Lawrence R. Rabiner. 1989. A tutorial on hidden Markov models and selected applications in speech recognition. Proceedings of the IEEE, 77(2):257–286.
Nitish Srivastava, Geoffrey Hinton, Alex Krizhevsky, Ilya Sutskever, and Ruslan Salakhutdinov. 2014. Dropout: A simple way to prevent neural networks from overfitting. Journal of Machine Learning Research, 15(1):1929–1958.
Andreas Stolcke. 2002. SRILM - an extensible language modeling toolkit. In Proceedings of ICSLP, pages 901–904.
Tijmen Tieleman and Geoffrey Hinton. 2012. Lecture 6.5-rmsprop: Divide the gradient by a running average of its recent magnitude. COURSERA: Neural Networks for Machine Learning, 4(2).
Erik F. Tjong Kim Sang and Fien De Meulder. 2003. Introduction to the CoNLL-2003 shared task: Language-independent named entity recognition. In Proceedings of CoNLL, pages 142–147.
Oriol Vinyals, Łukasz Kaiser, Terry Koo, Slav Petrov, Ilya Sutskever, and Geoffrey Hinton. 2015. Grammar as a foreign language. In Proceedings of NIPS, pages 2755–2763.
Wei Wang, Kevin Knight, and Daniel Marcu. 2006. Capitalizing machine translation. In Proceedings of NAACL-HLT, pages 1–8.
A Comprehensive Black-box Methodology for
Testing the Forensic Characteristics of Solid-state Drives
Gabriele Bonetti, Marco Viglione, Alessandro Frossi, Federico Maggi, Stefano Zanero
DEIB, Politecnico di Milano
{gabriele.bonetti,marco.viglione}@mail.polimi.it, {frossi,fmaggi,zanero}@elet.polimi.it
ABSTRACT
Solid-state drives (SSDs) are inherently different from traditional drives, as they incorporate data-optimization mechanisms to overcome their limitations (such as a limited number of program-erase cycles, or the need to blank a block before writing). The most common optimizations are wear leveling, trimming, compression, and garbage collection, which operate transparently to the host OS and, in certain cases, even when the disks are disconnected from a computer (but still powered up). In simple words, SSD controllers are designed to hide these internals completely, rendering them inaccessible except through direct acquisition of the memory cells.
These optimizations have a significant impact on the forensic analysis of SSDs. The main cause is that memory cells can be preemptively blanked, whereas a traditional drive sector would need to be explicitly rewritten to physically wipe the data. Unfortunately, the existing literature on this subject is sparse and the conclusions are seemingly contradictory.
In this paper we propose a generic, practical, test-driven methodology that guides researchers and forensic analysts through a series of steps to assess the “forensic friendliness” of an SSD. Given a drive of the same brand and model as the one under analysis, our methodology produces a decision that helps an analyst determine whether or not an expensive direct acquisition of the memory cells is worth the effort, because the extreme optimizations may have rendered the data unreadable or useless. We apply our methodology to three SSDs produced by top vendors (Samsung, Corsair, and Crucial), and provide a detailed description of how each step should be conducted.
1. INTRODUCTION
Solid-state drives (SSDs) have reached remarkable popularity nowadays, as their increasing capacity and affordable prices have made them a good alternative to standard, platter-based hard drives (HDDs hereinafter) [10]. SSDs offer the flexibility and compatibility of traditional drives, along with the shock resistance ensured by the lack of mechanical components, typical of flash drives, and the speed offered by flash memories.
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.
ACSAC ’13 New Orleans, Louisiana, USA
Copyright 2013 ACM 978-1-4503-2015-3/13/12 ...$15.00.
SSDs have a shorter lifespan than HDDs. NAND-based flash chips, in fact, have a physical limit of around 10,000 program-erase cycles. When approaching and surpassing this limit, NAND floating gates exhibit problems in retaining their charge and, if not constantly refreshed, they lose their content. This means that keeping an SSD without power for a couple of days may lead to data loss. While 10,000 cycles may seem a very high number, it is a rather low lifespan when compared with hard drives. Another limitation is that blocks that need to be rewritten must be blanked first, causing extra overhead. This issue is further exacerbated in SSDs, where the smallest addressable unit is a 16 to 512 KB block. SSD vendors have developed specific techniques such as write caching, trimming, garbage collection and compression, which aim at reducing the actual number of physical program-erase cycles. With these optimizations, SSD controllers are much more active than HDD controllers, which simply let read and write requests pass through.
As a consequence, existing and widely-adopted forensic data-acquisition and analysis procedures may not be completely suitable for SSDs (e.g., the hash of an SSD may not be “stable” over time, as obsolete data may be automatically wiped by internal optimizations). The only viable alternative is a white-box acquisition that bypasses the controller and reads the content of the NAND chips. Unfortunately, as explained in §2.1, a white-box acquisition is expensive, not always feasible, can possibly disrupt the drive, and may still lead to the conclusion that data is lost or damaged. In this regard, it would be useful to have a simple and affordable (black-box) triage procedure to decide whether a white-box analysis may produce a usable outcome given the SSD brand, model and release, and OS.
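The hash-instability observation suggests a simple black-box check: hash the drive twice across an idle period and compare. The sketch below illustrates the idea; the device path and waiting time are placeholders, and a real acquisition would of course be performed behind a write blocker:

```python
import hashlib
import time

def device_hash(path, block_size=1 << 20):
    """SHA-256 of a raw device or image, read in fixed-size blocks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        while True:
            block = f.read(block_size)
            if not block:
                break
            h.update(block)
    return h.hexdigest()

def is_hash_stable(path, wait_seconds=60):
    """Acquire two hashes separated by an idle period. A mismatch indicates
    that the controller modified data on its own (e.g., garbage collection),
    even though the host issued no write commands."""
    first = device_hash(path)
    time.sleep(wait_seconds)
    return device_hash(path) == first
```

Usage would be `is_hash_stable("/dev/sdX")` with `/dev/sdX` a placeholder for the drive under test.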
In this paper we propose a generalized, practical analysis methodology to selectively address the peculiarities of SSDs that may impact forensic acquisition and reconstruction. Our methodology is a test-driven workflow that guides the forensic analyst through a series of experiments. The goal of each experiment is to assess how the controller logic behaves under different conditions and to provide the analyst with useful insights into how the SSD under examination works and what optimizations it adopts. Given the SSD model, brand and release, and the OS (if any) used on that SSD, our workflow provides (i) insights on the potential impact of such optimizations on the results of standard forensic tools, and (ii) a practical decision framework to determine the expected success rate of retrieving lost data through white-box analysis.
As our methodology is black-box, it is transparently applicable to any SSD brand and model without modification. Throughout the paper, we show this by applying our workflow to three SSDs from different vendors, each with a different controller, chosen because they are among the most widely used: in this way we cover a vast variety of devices on the market and can analyze their peculiar behaviors, which are directly tied to the controller they are built with. Beyond the specific experiments that we carry out for the sole purpose of demonstrating the practicality of our workflow, we show that a forensic analyst can use our tests to assess whether a certain feature is implemented in an arbitrary SSD. Based on the result of each assessment, for which we provide real-world results, the analyst can consider the possibility of proceeding with a white-box analysis (for instance, if TRIM or wear leveling is implemented).
In summary, this paper makes the following contributions:
• We propose a test-driven, black-box methodology to determine whether an SSD implements trimming, garbage collection, compression and wear leveling.
• We show how the outcome of our methodology guides practitioners in understanding how these mechanisms impact their chances of data retrieval using traditional black-box or expensive white-box analysis techniques.
• We demonstrate our methodology by applying it to three popular SSD brands and models, and detail precisely how each step is conducted and how the results of each step are interpreted.
2. BACKGROUND AND MOTIVATION
SSDs employ a complex architecture, with many hardware and software layers between the physical memory packs and the external interface to the computer. These layers, merged in the flash translation layer (FTL) [12], are in charge of reading and writing data on the ATA channel on one side and on the memory chips on the other side, as well as of compressing, encrypting or moving data blocks to perform optimizations. In HDDs the OS has direct access to the data contained on the platters, and the controller is limited to moving the magnetic head and reading or writing data. The FTL of SSDs instead performs much more complex functions: it translates the logical block addresses (LBA) requested by the OS into the respective physical block addresses (PBA) on the memory chips. The underlying mapping is completely transparent and can be modified by the FTL at any time, for any reason. The need for mechanisms such as the FTL has been studied extensively by Templeman and Kapadia [16], who model the likelihood of a cell wearing out (i.e., losing its ability to retain data) and its maximum lifespan. They show that the endurance of memories varies greatly among vendors and chip models, and that premature decay is caused by stressing cells with continuous writes.
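The LBA-to-PBA indirection can be illustrated with a toy FTL model (purely didactic, not any vendor's actual policy). Note how a rewrite silently relocates the logical block and leaves the old physical block holding stale, potentially recoverable data until it is garbage-collected:

```python
class ToyFTL:
    """A toy flash translation layer: every logical write goes to a fresh
    blank physical block, and the old physical block becomes stale garbage."""

    def __init__(self, n_physical):
        self.mapping = {}                    # LBA -> PBA
        self.free = list(range(n_physical))  # blank physical blocks
        self.stale = []                      # obsolete blocks awaiting GC

    def write(self, lba):
        if lba in self.mapping:
            # The old copy still physically holds the previous data.
            self.stale.append(self.mapping[lba])
        self.mapping[lba] = self.free.pop(0)
        return self.mapping[lba]

ftl = ToyFTL(n_physical=8)
ftl.write(0)  # LBA 0 mapped to PBA 0
ftl.write(0)  # rewrite: LBA 0 remapped to PBA 1, PBA 0 becomes stale
```

This is also why wear leveling complicates forensics: the host-visible address space says nothing about where (or in how many stale copies) data physically resides.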
2.1 White-box Forensics Analysis
The action of the FTL is transparent to software and to the host OS: to the best of current knowledge, there is no way to bypass the FTL via software and explore the raw content of the memory chips. Hardware intervention, also known as white-box acquisition, is required. Breeuwsma et al. [7] showed that it is possible to acquire data from a flash memory chip in several ways. One option is to use flasher tools that interact with the chip directly via the pins of the board; other options are the use of a JTAG port, usually left by vendors on devices to bypass the controller, or, in extreme cases, the physical extraction of the chip for dumping via standard readers.
Although a complete white-box analysis of an SSD is theoretically possible and in some cases feasible, it is also very difficult, time consuming and expensive, because it requires custom hardware. Creating custom hardware requires the forensic analyst to acquire specific skills, buy expensive equipment, and, once a successful acquisition is finally carried out, spend a significant amount of time reversing the implementation of the SSD’s controller policies. We attempted to read directly from the memory chips using inexpensive (i.e., tens of US dollars) clips1, but we obtained a fragmented, incomplete raw file. The techniques developed for small memories could be ported to SSDs, but results are not guaranteed. As a matter of fact, the applicability of white-box techniques highly depends on the disk’s architecture and hardware design. We performed an exploratory experiment with three drives (Samsung, Corsair, and Crucial) and limited hardware resources, and found it very hard even to access the chips on the board without disrupting them, or to find accessible JTAG ports: understandably, vendors tend to protect their intellectual property (i.e., the FTL algorithms) by not allowing this kind of access to the hardware. Last, but not least, white-box approaches must deal with data-compression features: the sole knowledge of the compression algorithm is not sufficient. Indeed, the analyst would need to know both the compression algorithm and the data allocation policy (i.e., how bytes are spread over the memory chips), which is definitely protected (or at least, not public) information.
2.2 Black-box Forensics Analysis
Differently from white-box approaches, black-box approaches read data as presented to the ATA interface by the SSD controller.
Bell and Boddington [5] analyzed the file recovery rate on SSDs versus HDDs during a standard, black-box forensic acquisition. When issued a quick format command, the SSD used in the experiment wiped its entire content (i.e., text files) irrecoverably in a matter of minutes. They confirmed this result with a write blocker (i.e., re-attaching the SSD to a write blocker after the quick format), showing that the deletion did not happen as a result of commands issued by the host or its OS: SSDs can indeed schedule and perform their own write operations. This work provided one of the first hypotheses on how garbage-collection algorithms work, stating that some of them (primarily Samsung’s) are capable of “looking at the used/unused aspects of an NTFS filesystem by examining the free space bitmap”. The authors hypothesized that these controllers may be file-system aware and need no OS intervention to blank unused blocks. This poses major issues, rendering traditional forensic methodologies (such as the use of write blockers) insufficient to preserve the digital evidence. However, as we report in §4.2, we were unable to replicate their experiment, even using the same OS, scripts, drive (including firmware and version) and working conditions. Even with the authors’ help, we were unable to find the reason for this difference.
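The measurement behind an experiment of this kind reduces to writing many copies of a known marker before the quick format, then scanning a raw black-box image for surviving copies. A minimal sketch (the marker and image layout are invented for illustration):

```python
def recovery_rate(raw_image, marker, expected):
    """Count how many copies of a known marker survive in a raw disk image,
    as a fraction of the copies originally written before the quick format."""
    count = start = 0
    while True:
        pos = raw_image.find(marker, start)
        if pos < 0:
            break
        count += 1
        start = pos + len(marker)
    return count / expected

# A trimmed/blanked drive reads back mostly zeroes; here 2 of 4 markers survive.
image = b"FORENSIC-MARKER" + b"\x00" * 64 + b"FORENSIC-MARKER" + b"\x00" * 64
print(recovery_rate(image, b"FORENSIC-MARKER", expected=4))
```

Repeating the scan over time (with the drive powered but idle) distinguishes OS-driven deletion from autonomous controller activity.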
On the one hand, treating SSDs just like HDDs with black-box tools ensures only partial observability of the controller’s behavior. On the other hand, black-box approaches are more practical and convenient than white-box approaches; notably, they are less obtrusive and less expensive. Unfortunately, to the best of our knowledge, there is no scientifically tested black-box methodology for SSDs that supports the forensic analyst in evaluating the soundness of the results produced by existing tools.
2.3 Challenges and Goals
When applied to SSDs, black-box and white-box approaches have symmetric advantages and drawbacks: the former fails when a drive silently performs internal optimizations, whereas the latter fails on proprietary hardware, which is difficult to manipulate and access, or when data is compressed or encrypted. Indeed, no single hardware tool or methodology can help with every SSD drive, since each of them has a different architecture, different chip positioning, and many other details that make “generic” hardware tools impossible to build. In addition, SSDs use different flash memory chips; these
1We used the TSOP NAND clip socket, available online for 29USD.
have very different working parameters. In some cases, it may be physically impossible to connect to the memory chips because of the way they are soldered to the board; as a consequence the chip often needs to be removed, thus potentially damaging the drive or destroying the evidence. SSD controllers are specifically built to reach high throughput by leveraging parallel reads and writes; custom forensic hardware is much slower and can read only one chip at a time, making the dump (and the reconstruction) of an entire drive a very long process. Moreover, data compression or proprietary encryption may easily disrupt the entire white-box analysis, as can the capability of certain controllers to wipe obsolete blocks, which makes it impossible to recover deleted files.
On the other hand, black-box approaches are less precise. However, being independent of the proprietary hardware, they can be easily generalized. Our key observation is that a black-box triage is a mandatory prerequisite before committing resources to a challenging, costly and potentially fruitless white-box analysis.
In summary, as every drive and every controller behave differently from one another, we focus on providing a general methodology to perform forensically sound tests and determine how the FTL of a given SSD affects the results that standard forensic techniques may yield. Therefore, the main goal of our work is to devise an analysis methodology that advances the state of the art in its generality, and hopefully offers a useful reference for forensic practitioners and researchers. The second goal is to provide pure black-box techniques to “estimate” the likelihood of retrieving additional data through a white-box effort, allowing forensic experts to triage evidence and avoid wasting resources. Last, we strive to replicate and validate the experiments described in the literature, in order to take previous conclusions into account in our methodology.
3. METHODOLOGY OVERVIEW
The input to our methodology is an SSD of the same brand and model as the one under examination. The first step is to conduct a series of tests that determine whether that SSD implements certain features (regardless of what the vendor states), how fast and aggressive they are with respect to stored data, and how they would influence forensic data reconstruction. The first step covers the following aspects:
TRIM This functionality mitigates the known limitation of SSDs that requires any block to be blanked before it can be rewritten. The trimming function erases data blocks that have been marked as “deleted” by the OS. Trimming has a negative impact on forensic analysis. Indeed, on-disk data persistence after deletion is no longer guaranteed: Once a block is marked as free by the OS, the controller decides when to blank it according to its policies. As noted in [5], this can occur regardless of the data connection between the SSD and a host computer (e.g., during an acquisition, even when write blockers are used). Our methodology can determine the percentage of blocks that get erased and how fast this happens (§4.1).
Garbage collection (GC) This is a functionality that SSD vendors often list as one of their most useful and interesting features, capable of greatly improving the drive’s performance. However, as explained in §3.1 and §4.2, from our investigation we
SSD | WL | TRIM | GC | Compression
Corsair F60 | ✓ | ✓ | ✓ | ✓
Samsung S470 | ✓ | ✓ | ✓ | –
Crucial M4 | ✓ | ✓ | – | –
Table 1: SSD features as reported by vendors.
conclude that it is very hard even to define the concept of garbage collection. Bell and Boddington [5] hypothesize that GC works by making the controller somehow aware of the filesystem, and able to infer on its own which blocks are obsolete by monitoring the file-allocation table. If this were the case, GC would bias forensic acquisitions significantly. GC is not triggered by the OS; consequently, data could be erased whenever the disk is powered on, even if no write commands are issued, thus rendering write blockers and other precautions useless. Therefore, it is important to know whether an SSD implements GC. Our methodology can determine whether GC is active.
Erasing patterns Some SSDs show peculiar behaviors when using TRIM: They do not erase all the marked blocks but rather a subset of them, based on the drive’s internals. Our methodology explores this behavior to characterize the erasing patterns (§4.3).
Compression Some drives transparently compress data to use fewer blocks and reduce cell wearing. Compression poses no direct challenges to black-box forensic acquisitions, whereas it makes white-box analysis useless, as the data read directly from the chips would be unusable unless the compression algorithm were known or reverse engineered. Therefore, our methodology includes a step that can verify whether compression is active.
Wear leveling (WL) This functionality reduces the usage of flash cells by spreading their consumption as evenly as possible across the drive. To do so, the FTL may activate wear leveling if certain blocks have reached an excessive number of writes compared to the rest of the disk. The easiest and least expensive wear-leveling implementation allows the FTL to write the new block on another, less-used portion of the disk, and update the internal file-mapping table that the FTL maintains [2]. Alternatively, vendors can provide the disk with extra physical space, so that new blocks can be written on brand new memory cells. This second technique is used, for example, by the Corsair F60 SSD, which has a total of 64GB of flash memory but allows only 60GB to be addressed at a time; the drive is more expensive for the vendor, but if the wear-leveling functionality is correctly implemented it grants a much longer lifespan. Although wear leveling is quite standard in modern drives, it is very useful to know whether the SSD implementation masks the effect of the so-called “write amplification” (see §4.5), which is a direct consequence of wear leveling.
Files recoverability Even though blocks may not be erased, they might be changed or partially moved, making it impossible to retrieve them through carving. This test checks how many deleted files can be retrieved from the drive (§4.6).
Given the outcome of these tests, the second step is to interpret the results and provide a ranking of a drive in terms of its “forensic friendliness”, as detailed in §5. Our methodology covers all known combinations of factors that may trigger each feature. In addition, our methodology is designed to avoid redundant tests, which would certainly not trigger any of the features. When feasible, we compare our results with the outcome of previous studies. In particular, we validate the experiments of Bell and Boddington [5] on garbage collection, which is a particularly controversial and ill-defined topic as detailed in §3.1.
3.1 Garbage Collector vs. Garbage Collection
The difference between the garbage collector and garbage collection is a controversial concept that needs to be clarified before explaining our methodology. Many works in the literature treat these two features as substantially the same and propose methods to trigger and reverse-engineer this functionality. They are, however, two different (logical) components of an SSD controller, which unfortunately share a very similar name, causing considerable confusion.
For the purpose of our work, the garbage collector is the process that deals with unused blocks—as a garbage collector does with unused variables in modern memory managers. The internals of SSD garbage collectors are not disclosed, and they may vary from drive to drive. However, it is known that the garbage collector is tightly tied to the TRIM functionality: among other capabilities, it has access to the “preemptive-erase table” (§4.1) filled by TRIM and takes care of physically wiping trimmed blocks. Additionally, the garbage collector helps wear leveling by moving blocks around whenever the wear factor of a cell is beyond a certain threshold.
On the other hand, vendors have never disclosed any details about the garbage collection functionality, although there has been some speculation about how it is supposed to work. The only known work is by Bell and Boddington [5], who—partially supported by Samsung—hinted that the garbage collection functionality allows the drive controller to introspect the file allocation table of known file systems (i.e., NTFS, ext3 and ext4) to autonomously decide which blocks can be safely wiped, without OS intervention or an implementation of TRIM. As a matter of fact, in §4.2 we document our tests on the garbage collection functionality under these hypotheses, to see whether the results from various drives could confirm its presence and behavior. Our experiments led to results that contradict Bell and Boddington’s work.
3.2 Write Caching in SSD Experiments
SSDs are equipped with a small amount of DRAM-based cache memory, whose function is to reduce the number of physical writes. This feature must be taken into account when performing experiments on SSDs because of its obvious side effects. Caching can be ignored in tests that use large files, as the cache is negligible with respect to the drive capacity and is therefore bypassed. However, it must be taken into account when performing fine-grained tests.
Biases introduced by caching were not addressed by Antonellis [4], who performed experiments by writing everyday-use graphic and text files on the SSD and then verified that they were completely unrecoverable after deleting them. Although [4] contains no clear statement about the OS used, in 2008 no OS had TRIM support for SSDs [3], and therefore the file deletion could not have been a consequence of TRIM. The only explanation is that the files the author used were not big enough to completely fill the drive cache (usually around 512MB to 1GB) and were therefore never actually written to disk: they were simply erased from the cache, and no trace was left to allow a full or partial recovery. This effect is also noticed in [13]: The percentage of recoverable blocks when using small files is considerably lower than the same percentage with big files, even under the same conditions and usage patterns. This can be explained by the fact that small files usually get stored in the cache and are never written to disk.
Our methodology requires that the cache be disabled either from the OS (e.g., via the hdparm -W 0 command) or bypassed by using very large files for write operations whenever possible, to fill up the cache and force the drive to physically write on flash cells.
Figure 1: TRIM test flow.
Figure 2: The amount of blocks erased by TRIM in our Corsair F60 disk depends on the amount of used space.
4. IMPLEMENTATION DETAILS
We apply the methodology described in this section to three SSDs, each with a different controller and combination of features: a Corsair F60 (controller: SandForce SF-1200), a Samsung S470 MZ-5PA064A (controller: Samsung ARM-based 3-core MAX) and a Crucial M4 (controller: Marvell 88SS9174-BLD2). As stated by the vendors, these drives implement wear leveling. Furthermore, the Corsair F60 performs data compression, whereas both the Samsung S470 and the Corsair F60 are said to implement garbage collection. Table 1 summarizes the functionalities according to the official specifications. Instead of presenting the results of our tests in a separate section, for the sake of practicality we explain the results immediately after the description of each step.
4.1 TRIM
Using trimming, whenever blocks are erased or moved on an SSD, they are added to a queue of blocks that should be blanked. This operation is lazily performed by the garbage collector process as soon as the disk is idle, making trimmed blocks ready to be written and ensuring balanced read-write speeds. Trimming is triggered by the OS, which informs the controller when certain blocks can be trimmed. We focused on Windows and Linux: Windows 7 and 8 (and Server 2008R2), as well as Linux kernels from 2.6.28 onward, support trimming.
Methodology. Fig. 1 shows the steps required to determine whether and how an SSD implements trimming. Before starting, the disk is wiped completely and the write cache is disabled or filled, as detailed in §3.2. Then, a stub filesystem is created and filled with random content (i.e., files) up to different percentages of its capacity: 25, 50, 75 and 100%. This is because certain controllers exhibit different trimming strategies depending on the available space. The tests described below are repeated for each percentage.
As both [13] and our experiments show that some TRIM implementations behave differently when dealing with (1) quick formats or (2) file deletions, we analyze both cases independently. When a (1) quick format command is issued, the OS will supposedly notify the SSD that the whole drive can be trimmed. The disk is left idle and the filesystem is checked for zeroed blocks. If the SSD
implements TRIM procedures, we expect to observe changes in the number of zeroed blocks; otherwise, no changes will happen. To check for zeroing, we use a sampling tool—as Bell and Boddington [5] did—which loops over the entire disk and samples 10KB of data out of every 10MB. It then checks whether the sample is completely zeroed. Whenever a non-zeroed sample is encountered, the tool checks whether it becomes zeroed in subsequent loops. The sampling size was chosen empirically to find a good trade-off between overhead and accuracy. The choice of the sampling size depends on the time that the analyst wants to spend on this test, and only affects the final decision. The test ends when the situation does not change within a timeout, which can sometimes be obtained from the vendor’s documentation (i.e., the time between runs of the garbage collector). If no documentation is available, we set a very long timeout (i.e., 24 hours, based on several trials that we ran on all our disks). During our experiments, we found that if the TRIM functionality is present and active, it triggers within 1 to 10 seconds. Similarly, (2) our workflow deletes single files from the filesystem and monitors their respective blocks. Depending on the file size, sampling may not be necessary in this case. The test ends, as before, when all the (remaining) portions of the erased file do not change within a timeout.
Results. We ran this test on Windows and Linux, both of which include stable support for TRIM.
On NTFS (Windows 7), the Samsung S470’s and Crucial M4’s trimming was very aggressive in both quick format and file deletion: The disk was wiped in 10 seconds by the Samsung S470 controller; on the Crucial M4, wiping occurred even before notifying the OS. Similar results were obtained with file deletion: the sectors were completely wiped in 5 and 10 seconds, respectively. The Corsair F60, instead, behaved differently. After issuing a quick format, only a small percentage of data was erased; when we repeated the test at different filling levels, we surprisingly found that the fraction of erased blocks is roughly proportional to the total used space, as shown in Fig. 2: There are thresholds that define how much space must be trimmed depending on the used space. In particular, there are 5 ranges in which the amount of zeroed space increases linearly, whereas it remains constant at all other filling values. The Corsair F60 also behaved unexpectedly when dealing with file deletions: Some files were wiped at most 3 seconds after deletion, whereas other files were not wiped at all and could be recovered easily, depending on their allocation. This discovery spurred an interesting study of the erasing patterns, which is explained separately in §4.3.
On ext4 (Ubuntu Linux 12.04) we obtained significantly different results. In the quick-format branch the outcomes were similar across the different disks: The entire content of the SSD was erased in about 15 seconds. This can be explained by the fact that Linux used the same AHCI device driver for all the SSDs. Single file deletion, instead, showed different behaviors. The Samsung S470 did not erase any block and all the files were completely recoverable. The Crucial M4 apparently did not erase any file, at least until the device was unmounted; at that point the blocks were erased. Apparently, the driver notifies a file deletion only when it becomes absolutely necessary to write data on disk (i.e., when the disk is unmounted or when the system is idle long enough to flush data to the non-volatile storage). The Corsair F60 showed none of the behaviors exhibited with NTFS: All the files were erased correctly. Supposedly, the Windows drivers implement a different trimming policy, or the SandForce controller used by this SSD features NTFS-specific optimizations.
Figure 3: Garbage collection test flow.
Figure 4: Erasing patterns test flow.
4.2 Garbage Collection
Methodology. The entire test must be carried out on an OS and drivers that do not support TRIM, to avoid interference between TRIM and GC, whose effects are indistinguishable from a black-box viewpoint. Fig. 3 summarizes our test to determine whether GC is implemented and when it starts.
After the usual preliminary steps, the dummy filesystem is created and filled with random files. The content is not important, yet the size is, as small files are not physically written on disk. This is particularly important when write caching cannot be disabled reliably. Then, the same sampling procedure described in §4.1 is started, and a quick format is issued. As there is no reliable information regarding the triggering context and timeout, our methodology explores two different paths. First, the disk is kept idle to allow the triggering of the GC. Alternatively, the SSD is kept active by continuously overwriting existing files, adding no new content. Bell and Boddington found that GC triggers in about 3 minutes. Some non-authoritative sources, however, state that a reasonable timeout ranges between 3 and 12 hours. Our methodology proposes to wait up to 16 hours before concluding the experiment.
Results. Even hours after the default timeout, none of the SSDs performed GC. Since the Samsung S470 and Corsair F60 were advertised as having GC capabilities, we devised a simple additional test to validate this result. The goal was to determine what percentage of non-random files can be recovered after a quick format. We filled each disk with copies of the same JPEG image until there was no space left on the device, formatted it on a TRIM-incompatible OS, and let it idle. As shown in Table 5, even with simple tools (e.g., Scalpel), we recovered 100% of the files from both drives, confirming that no GC occurred. Note that we used a carver merely as a baseline for recoverability: Our approach is not meant to evaluate the performance of carvers in general.
Figure 5: Test results for erasing patterns test performed on Corsair F60 SSD: at different filling levels an increasing number of evenly-spaced stripes are visible. Green areas are zeroed by the controller, while blue areas remain unchanged. The non-erased blocks in the first stripe (a) contain the copy of the master file table and are therefore not zeroed.
4.3 Erasing patterns
As shown in §4.1, certain SSD controllers (e.g., Corsair F60 with NTFS) may exhibit unexpected trimming patterns. Therefore, we devised a workflow to further explore these cases and assess to what extent a forensic acquisition is affected by TRIM or GC.
Methodology. As shown in Fig. 4, after the preliminary steps the disk is filled with a dummy filesystem containing files with random data (the content is irrelevant as we are focusing on how the controller handles deletion). Then, a raw image of the disk is acquired with dd before issuing a quick format instruction. A second raw image is then acquired after a while (see the considerations in §4.1 on timeouts). Clearly, “raw” here refers to what the controller exposes to the OS. The obtained images are compared block-wise to highlight erased sections. This test must be run at different disk filling levels, in case the controller behaves differently based on the amount of free space.
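The block-wise comparison between the two dd images can be sketched as follows (the block size is a parameter; this is an illustrative sketch, not the authors’ tool). A block counts as erased when it held data in the first acquisition and is completely zeroed in the second.

```python
def erased_blocks(before: bytes, after: bytes, block=4096):
    """Return offsets of blocks that held data in `before` but are
    completely zeroed in `after` (i.e., erased by the controller)."""
    erased = []
    for off in range(0, min(len(before), len(after)), block):
        b, a = before[off:off + block], after[off:off + block]
        if any(b) and not any(a):  # had data before, all-zero after
            erased.append(off)
    return erased
```

Plotting the returned offsets over the address space yields maps such as the stripes of Fig. 5.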
Results. We applied this test to the Corsair F60 SSD, which exhibited odd behavior in the TRIM test. We analyzed it at 10, 50, 75 and 100% of space used. At each level we analyzed the entire disk as explained above and created the maps shown in Fig. 5. Interestingly, we notice four stripes in predictable areas (green) where the files are surely going to be erased, whereas the rest of the disk (blue) is not modified even after file deletion. The small difference in the first stripe is due to the fact that the master file table is allocated within it, and this portion is not erased (Fig. 5(a)).
This result is consistent with the results described in §4.1. In particular, the first four linearly-increasing portions that appear in the TRIM test result of Fig. 2 correspond to the very same green areas highlighted with this erasing-pattern experiment. Therefore, the reaction of the controller when dealing with file deletion on NTFS is explained by those green areas: If a file is allocated within the green stripes, it will surely be erased by TRIM, whereas files that fall outside the green areas are not trimmed.
We validated this result as follows. We formatted the drive and filled it with easily-recoverable files (i.e., JPEG image files, as the JPEG header is easy to match with carvers). Then, we selectively deleted the files allocated inside or outside the green stripes, acquired the entire disk image, tried to recover them, and mapped their (known) positions against the stripe positions. Table 2 shows that only one file (0.0034%) within the erased stripes was recovered, whereas this percentage reaches 99% for files allocated entirely outside the green areas, thus confirming the results of Fig. 5.
Position | Written | Recovered | %
Within erased stripes | 29,545 | 1 | 0.0034%
Outside erased stripes | 71,610 | 71,607 | 99%
Table 2: File-recoverability test for the Corsair F60 SSD: Only one of the files that were written within the erased areas could be recovered, whereas 99% of those outside those bounds could be retrieved with standard tools.
Figure 6: Compression test flow.
4.4 Compression
Methodology. The key intuition behind our test is that the overhead due to hardware compression is negligible in terms of time. Thus, it will take considerably less time to physically write a highly compressible file than an incompressible one. However, this would not be the case if the controller went back and compressed the files afterward as a background task.
As shown in Table 3, compression algorithms yield the best results with low-entropy files, whereas they are not very effective on high-entropy data. We therefore created two files with very different levels of entropy, from /dev/zero and /dev/urandom respectively. The methodology is summarized in Fig. 6. After creating the two files (10GB each, to bypass write caching), we monitor via iostat the time spent in the transfer and the throughput: A high throughput indicates compression, as less data is physically written on disk.
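Note that Table 3 reports entropy normalized to [0, 1]; Shannon entropy in bits per byte (0 to 8; divide by 8 to normalize) can be computed with a short sketch like the following, which makes explicit why the two test files sit at opposite ends of the compressibility scale:

```python
import math
from collections import Counter

def shannon_entropy(data: bytes) -> float:
    """Shannon entropy of a byte string, in bits per byte (0.0-8.0)."""
    if not data:
        return 0.0
    counts = Counter(data)  # frequency of each byte value
    n = len(data)
    return -sum(c / n * math.log2(c / n) for c in counts.values())

# A stream of zeros (as from /dev/zero) has entropy 0.0;
# a uniform byte distribution (as from /dev/urandom) approaches 8.0.
```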
Results. As Fig. 7(a) shows, both file transfers in the Samsung S470 took almost the same time, showing no sign of compression.
The Crucial M4 test shown in Fig. 7(b) yielded the same result, even if with different values. This drive exhibits systematic performance glitches at the same points in time for each run. This happens almost every 25 seconds, regardless of the file’s size and transfer time, and does not happen with the other drives under the same conditions. We transferred files of different sizes and under different conditions (i.e., computer, source, OS) but obtained consistent results. We can speculate that these glitches are due to some computations performed by the controller at regular time intervals.
The Corsair F60 is the only one advertised as having compression capabilities. Indeed, as shown in Fig. 7(c), it behaves in a very different fashion: The transfer time for compressible files is about one third of that for incompressible files. Therefore, we can infer that the actual amount of data physically written on disk is considerably lower, meaning that the controller compresses it transparently.
File | gzip | 7zip | bz2 | Entropy
/dev/zero | 1,030 | 7,086 | 1,420,763 | 0.0
/dev/urandom | 1.0 | 0.99 | 0.99 | 0.99
Table 3: Compression ratios of the files used for the compression test: files with high (Shannon) entropy are difficult to compress and therefore result in more data to be written on disk.
[Plots: sampled throughput over Time [m:s] for (a) Samsung, (b) Crucial, and (c) Corsair.]
Figure 7: Mean and variance of the sampled throughput among 15 repeated transfers of 10GB low- and high-entropy files (top and bottom row, respectively). For (a) and (b), low- and high-entropy file transfers have almost the same shape and duration, showing that the controller does not perform any kind of optimization (i.e., compression) on data before writing it. On the other hand, in (c) the throughput with low-entropy files is considerably higher, and the transfer takes about 1/3 of the time of the high-entropy file transfer. This result confirms that less data had to be physically written on disk, which means that compression was indeed performed by the Corsair drive.
All graphs show an initial transient with a very high transfer rate. This is the effect of write caching on the very first megabytes of the file being sent to the disk. Disabling write caching via software drivers, as explained in §3.2, does not always succeed. The only reliable way to bypass caching was to use large files. Nevertheless, the effect of caching does not affect our results.
4.5 Wear Leveling
Wear leveling is quite common, although none of the examined vendors clearly states what happens to the old versions of the same file (i.e., how write amplification [11] is treated). From a black-box viewpoint there are two possible situations. One alternative is that old blocks are not erased and remain where they were: in this case a carver may be able to extract many different versions of the same file, representing a clear snapshot of the data at a given point in time. Alternatively, the old data may be erased, moved out of the addressable space or simply masked by the controller, which in this case would tell the OS that no data is present (virtually zeroed block) where obsolete data actually is. Unfortunately, if we get no data from the disk, there is no way (with a black-box approach such as ours) to determine which of these is the case.
Another detail to take into account when dealing with wear leveling is that vendors do not explicitly reveal the conditions under which the functionality is triggered. From the available information and previous work ([1, 2, 9]) it appears that two conditions must hold: there should be enough free blocks with a lower write count than the one that is being overwritten, and there must be a certain difference between the write count of the used block and that of the new destination block. Since the precise values depend on the vendors and are not publicly available, we erred on the side of caution and left, in our experiments, at least 25% of the disk capacity free
Figure 8: Wear leveling test flow.
and overwrote the same blocks more than 10,000 times to cover whatever write-cycle gap may be present.
Methodology. Our test is not aimed at determining whether an SSD implements a wear-leveling feature, since this is pretty much standard nowadays. From the forensic viewpoint, what matters is whether wear leveling can be leveraged via black-box analysis to recover data. If a drive has no wear-leveling capabilities, or if write amplification is completely masked by the FTL, the end result is that nothing is lost and nothing is gained because of wear leveling.
Our test flow is shown in Fig. 8. An important preliminary step is to disable the OS and disk write caching, as it poses problems:
Figure 9: Files recoverability test flow.
The entire test flow requires the continuous re-writing of the same files, and it is extremely important that these write operations are physically sent to the disk.
Then the disk is filled up to 75% with a dummy filesystem. Since wear leveling is internal, the file or filesystem type has no impact, so we chose files with known patterns (to ease carving operations afterward) and an ext4 filesystem under Ubuntu Linux.
At this point, files are overwritten with new data (of exactly the same size) a total of 10,000 times while the zeroed space on disk is monitored. If, at any time, the zeroed space diminishes, it means that the controller wrote the new data somewhere else, using less-used blocks but leaving the old ones intact. Notice that remnant data could be garbage or unusable. For this reason, and only if the zeroed-space check gives positive results, it is advisable to perform a full disk acquisition with subsequent carving to determine whether different versions of the same files are effectively recoverable. As said, in the very likely case that both checks yield negative results, we cannot know what the controller is really doing; what we know is just that a standard forensic procedure will not be impacted by wear leveling.
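Monitoring the zeroed space amounts to counting all-zero sectors between overwrite rounds; a drop in the count means the controller redirected writes to fresh cells and left the old (no longer addressed) copies intact. A minimal illustrative helper operating on a raw image in memory:

```python
def count_zeroed_sectors(image: bytes, sector=512) -> int:
    """Count the sectors of a raw disk image that are entirely zeroed."""
    return sum(
        1
        for off in range(0, len(image), sector)
        if not any(image[off:off + sector])  # every byte is 0x00
    )
```

Comparing the count before and after each overwrite round implements the zeroed-space check described above.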
Results. We ran our test on all the disks in our possession. As expected, it is very unlikely to find a drive exposing multiple copies of the same files to a black-box analysis. Our results confirmed our expectation: both checks yielded negative results. We know, however, that all of the drives actually implement wear-leveling capabilities, as stated by their vendors. We can therefore only assume that the effects of write amplification are completely masked by the controller, which does not expose any internals.
SSD | FS | Written | Recovered | %
Samsung | NTFS | 112,790 | 0 | 0%
Samsung | ext4 | 110,322 | 0 | 0%
Corsair | NTFS | 101,155 | 71,607 | 70.79%
Corsair | ext4 | 99,475 | 0 | 0%
Crucial | NTFS | 112,192 | 0 | 0%
Crucial | ext4 | 110,124 | 0 | 0%
Table 4: Files recoverability test results: the drives implementing an aggressive version of TRIM (Samsung S470 on NTFS and the Crucial M4) did not allow the recovery of any file after a format procedure. The Corsair F60 on NTFS, as expected, has a non-null recovery rate due to the erasing pattern its TRIM implementation exposes. On ext4, however, this same disk allowed the recovery of 0 out of 99,475 files.
Figure 10: Use case workflow for assessing the forensic friendliness of a SSD.
4.6 Files Recoverability
The tests described so far are all aimed at determining whether some functionalities implemented by a given SSD are forensically disruptive, to ultimately allow a forensic analyst to assess whether some data is still retrievable. What usually interests the forensic analyst most, however, is being able to access and retrieve files on an SSD in much the same way as on a traditional HDD.
The test that we propose in this section determines how closely an SSD behaves like an HDD from a data-recoverability viewpoint. In HDDs, recoverability is affected by the filesystem policies on overwriting previous data. In SSDs, in addition to this, trimming, garbage collection, and the other unexpected controller behaviors described so far negatively impact recoverability.
Methodology. The test flow is shown in Fig. 9. The drive is first initialized with a dummy filesystem and filled with “carver-friendly” files: In our case, we wrote JPEG files of around 500KB each, and then quick formatted the drive. After the usual 24-hour timeout, we used Scalpel to attempt a file recovery.
Results. We ran our experiment on all our disks with a NTFS filesystem with enough copies of the same JPEG image to fill the entire drive. As summarized in Table 4, both the Crucial M4 and the Samsung S470 have a zero recovery rate, which means that the TRIM functionality tested in §4.1 actually works and erases all of the deleted files.
The Corsair F60 behaves differently, as shown in §4.3: 71,607 files out of 101,155 were recovered, totaling a 70.79% recovery rate on NTFS. Curiously, the files that were only partially recovered, or not recovered at all, were contiguous in small chunks. On ext4, instead, TRIM did not allow the recovery of any file.
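The recovery rates in Table 4 are simply the ratio of recovered to written files; as a quick check of the Corsair NTFS figure:

```python
def recovery_rate(recovered: int, written: int) -> float:
    """Recovery rate as a percentage, rounded to two decimals."""
    return round(recovered / written * 100, 2)

# Corsair F60 on NTFS (Table 4): 71,607 of 101,155 files recovered
rate = recovery_rate(71_607, 101_155)  # 70.79
```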
SSD | Written | Recovered | %
Samsung | 112,790 | 112,790 | 100%
Corsair | 101,155 | 101,155 | 100%
Table 5: Files recoverability without TRIM on Samsung S470 and Corsair F60 drives.
5. USE CASE: RANKING DRIVES
Although proposing a comprehensive and accurate classification of SSDs goes beyond the scope of this paper, we show how our methodology can be applied to indicate the “forensic friendliness” of an SSD. We consider the output of the TRIM, GC and file recoverability tests, and follow the workflow exemplified in Fig. 10.
• A. Platter-disk Equivalent. The SSD behaves as a HDD. Standard forensics tools are expected to work as usual. SSDs in this class present no disruptive behaviors (e.g., TRIM, GC).
• B. High Recoverability. TRIM and other wiping functionalities are implemented but they are not very aggressive: an HDD-equivalent recovery is expected.
• C. Low Recoverability. The SSD’s functionalities are quite aggressive and succeed in deleting or masking most of the deleted data that could have been recovered from a HDD. It is, however, still possible to achieve some results with standard tools.
• D. Complete Wiping. No deleted data can be recovered using standard black-box tools. White-box analysis may be a solution but it is not guaranteed to yield acceptable results. This is the worst possible case when performing a forensic analysis on a SSD.
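A minimal sketch of how the outcomes of the TRIM/GC tests and the recoverability test could be mapped onto these classes; the thresholds below are illustrative assumptions, not part of the paper’s workflow:

```python
def classify(trim_active: bool, gc_active: bool, recovery_rate: float) -> str:
    """Map black-box test outcomes to a forensic-friendliness class.

    recovery_rate: fraction of deleted files recovered (0.0-1.0).
    Thresholds are illustrative assumptions.
    """
    if not (trim_active or gc_active):
        return "A"  # Platter-disk equivalent: no disruptive features
    if recovery_rate >= 0.5:
        return "B"  # High recoverability: wiping is not aggressive
    if recovery_rate > 0.0:
        return "C"  # Low recoverability: most deleted data is gone
    return "D"      # Complete wiping: nothing recoverable black-box
```

For example, a drive with active TRIM but a ~71% recovery rate would fall in class B, whereas active TRIM with a 0% rate yields class D.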
Applying this method to our drives, we obtained the following classification. Our Crucial M4 implements a very effective TRIM functionality with any filesystem, which directly makes the recoverability test yield a 0% rate; thus, even though the garbage collector does not trigger, the associated class is D. Complete Wiping: this drive is very likely to make recovery of deleted files impossible. The same happens on the Samsung S470 with both NTFS and ext4 filesystems and on the Corsair F60 with ext4. The Corsair F60 with NTFS, instead, presents only a partially working TRIM implementation, which allows the recovery of almost 71% of deleted files; this combination of drive and filesystem is therefore associated with the B. High Recoverability class.
6. LIMITATIONS
Although our black-box workflow and experimental results are far more complete than those proposed in previous work, there are some limitations that it is important to be aware of.
First, each SSD comes with its own firmware version, which essentially embeds (part of) the FTL logic. As such, it determines the SSD's characteristics and, therefore, its forensic "friendliness" with respect to the features tested by our workflow. We do not consider changing the firmware during our tests for several reasons. First, not all SSD vendors release firmware upgrades. Second, and most importantly, firmware upgrades are often one-way procedures; this hampers the repeatability of the experiments at scale (i.e., the only way to downgrade a firmware to a previous version would be to buy another SSD, provided that the old version of the firmware is still on the market).
Second, the triggering of TRIM depends on the specific combination of OS, filesystem type, device driver, and AHCI commands implemented. The current version of our workflow explores the OS and filesystem type. For instance, we have shown in our experiments how the Corsair behaves differently under Windows (NTFS) and Linux (ext4). In a similar vein, other variables such as the device driver and AHCI commands could be considered. We have not included these variables in our methodology because they vary significantly from product to product, whereas our main goal was to provide a generalized testing workflow.
Last, an intrinsic limitation of our approach is that the forensic examiner needs to know the OS version before performing an investigation. The availability of this contextual information varies from case to case, and prior data about it is hard to find (due to the nature of forensic investigations). Therefore we cannot make any strong statement on how likely it is for such information to be available.
7. RELATED WORK
In this section, we review other relevant research in the area of SSD forensics, in addition to the works presented in §2.
7.1 White-box Forensics Analysis
Similarly to Breeuwsma et al. [7], Skorobogatov [15] also addressed data acquisition from flash memory chips, but at a lower level. His technique, however, is not suitable for forensic purposes because of the non-optimal recovery rate.
The state of the art in white-box analysis is the work by Bunker et al. [8] and [17], who built a complete custom setup to interact with flash memory chips using an FPGA and several custom wing boards to enhance its compatibility. Although their goal is to enable easy development and prototyping of FTL algorithms, compression, cryptography and sanitization protocols, the same setup can in theory be used to re-implement part of the FTL functionality to ease white-box acquisition of SSDs. However, the internals of a controller are usually undocumented; therefore, it may be very difficult to reconstruct files directly from the acquired data, and traditional file carvers are likely to fail. Although Billard and Hauri [6] showed how it is possible to analyze a raw flash dump and reconstruct files without prior knowledge of the disposition of blocks performed by the FTL, their technique works only with small-capacity chips (on the order of hundreds of megabytes).
Luck and Stokes [14] concentrated on the FAT structure and demonstrated how to rebuild audio and video files from dumped NAND memories; their work suffers from the very same shortcoming: it is tailored for small amounts of data (e.g., cellphone memories). Also, data reconstruction is made even more difficult by SSD controllers because they often make use of data parallelism over the flash memory chips on the board.
In contrast to white-box techniques, our proposed methodology is extremely convenient and practical and, most importantly, guarantees that the SSD is never damaged.
7.2 Black-box Forensics Analysis
Similarly to Bell and Boddington [5], King and Vidas [13] performed experiments on 16 different SSDs: they simulated real usage scenarios and tested the block-level recoverability. Each scenario was replicated under three OSs (Windows XP, Windows 7 and Ubuntu Linux 9.04). Their conclusion is that different combinations of usage, OS and file size influence the forensic recoverability of the SSD. Although this is by far the most exhaustive test on SSDs, the authors focus solely on data deletion as an effect of TRIM and garbage collection, without generalizing their findings. What is missing is an in-depth study of the correlation between the environment conditions (e.g., OS, filesystem, file size), the internal state of a disk (e.g., amount of free space, wear) and the corresponding behavior of the SSD. Our work goes beyond [13] because we designed and evaluated a comprehensive, test-driven methodology to fully understand the reasons behind each specific behavior.
Antonellis [4] analyzed the behavior of one of the first SSDs with respect to file deletion. He wrote some typical files such as documents and images on an NTFS-formatted SSD, and then erased them to see how much data was recoverable afterward. Surprisingly,
none of the files was recoverable via carving. These experiments, however, were limited to a single scenario and did not take into account all the factors that our methodology accounts for. For instance, as detailed in §3.2 and §4, thanks to our methodology we can explain why caching is the most reasonable explanation for the odd results of Antonellis [4]. Last, [4] focused on TRIM, whereas our methodology considers the features implemented in most of the SSDs currently on the market.
8. CONCLUSIONS
In order to overcome the intrinsic limitations of SSDs, onboard controllers adopt a number of advanced strategies, preemptively erasing blocks of deleted files (possibly even if not solicited by the OS) and even performing compression or encryption on the data. Each vendor implements a different controller, and therefore each SSD acts differently. Consequently, SSDs cannot be treated as standard HDDs when performing a forensic analysis: Standard tools and well-known techniques are based on the assumption that the hard drive does not modify or move data in any way, that every block can be read if it was not previously wiped, and that reading a block will yield the data physically contained by that block.
We proposed a complete testing methodology and applied it to three drives from leading vendors. For each test we interpreted the results to provide the forensic analyst with a practical way to fine-tune their approach to the acquisition of SSDs, which can be very similar to the acquisition of HDDs, or completely different. Indeed, we showed that the combination of controller, OS, filesystem and even disk usage can deeply influence the amount of information that can be retrieved from a disk using forensic procedures. We advanced the state of the art by showing further, previously unknown findings, such as the one described in §4.1: even though all the drives expose the TRIM functionality to the OS, our methodology allowed us to investigate the peculiarities of each implementation (e.g., a drive selectively wiping blocks only in specific portions of the disk). We also proposed a test aimed at verifying the implementation of a compression feature on the SSD. We showed that our test can indeed determine whether the drive uses compression; this is useful to assess the feasibility of a deeper white-box approach. We also investigated the controversial topic of garbage collection to see under what conditions it triggers.
Each of the proposed experiments is tailored to inspect how a given functionality works and how "aggressive" it is. They allow the analyst to know, for example, whether the SSD has a disruptive TRIM implementation or whether GC does not work under particular conditions. All this information can be of great help when performing a forensic acquisition on such drives, since it can adjust the expectations or even suggest particular techniques to adopt so as to prevent the controller from deleting potentially useful data. The results can also help the forensic expert estimate the likely success of costly procedures such as white-box memory analysis.
Besides addressing the aforementioned limitations, future work can focus on running our methodology on a wide range of OSs, SSD brands and models, in order to create a reference catalog useful for forensic investigations.
REFERENCES
[1] Wear leveling in Micron NAND flash memory. Technical Report TN-29-61, Micron Technology Inc., 2008. URL http://www.micron.com/~/media/Documents/Products/Technical%20Note/NAND%20Flash/tn2961_wear_leveling_in_nand.pdf.
[2] Wear-leveling techniques in NAND flash devices. Technical Report TN-29-42, Micron Technology Inc., 2008. URL http://www.micron.com/~/media/Documents/Products/Technical%20Note/NAND%20Flash/tn2942_nand_wear_leveling.pdf.
[3] Microsoft Co.: Support and Q&A for Solid-State Drives. MSDN Blog http://blogs.msdn.com/b/e7/archive/2009/05/05/support-and-q-a-for-solid-state-drives-and.aspx, 2009.
[4] Christopher J. Antonellis. Solid state disks and computer forensics. ISSA Journal, pages 36–38, 2008.
[5] Graeme B. Bell and Richard Boddington. Solid state drives: The beginning of the end for current practice in digital forensic recovery? Journal of Digital Forensics, Security and Law, 5 (3), December 2010.
[6] David Billard and Rolf Hauri. Making sense of unstructured flash-memory dumps. In SAC ’10, pages 1579–1583, New York, NY, USA, 2010. ACM. ISBN 978-1-60558-639-7.
[7] Marcel Breeuwsma, Martien De Jongh, Coert Klaver, Ronald Van Der Knijff, and Mark Roeloffs. Forensic data recovery from flash memory. Small Scale Digital Device Forensics Journal, 1:1–17, 2007.
[8] Trevor Bunker, Michael Wei, and Steven Swanson. Ming II: A flexible platform for NAND flash-based research. Technical Report CS2012-0978, UCSD CSE, 2012.
[9] Yuan-Hao Chang, Jen-Wei Hsieh, and Tei-Wei Kuo. Improving flash wear-leveling by proactively moving static data. IEEE Transactions on Computers, 59(1):53–65, Jan. 2010. ISSN 0018-9340.
[10] Jim Gray and Bob Fitzgerald. Flash disk opportunity for server applications. Queue, 6(4):18–23, July 2008. ISSN 1542-7730.
[11] Xiao-Yu Hu, Evangelos Eleftheriou, Robert Haas, Ilias Iliadis, and Roman Pletka. Write amplification analysis in flash-based solid state drives. In SYSTOR ’09, pages 10:1–10:9, New York, NY, USA, 2009. ACM. ISBN 978-1-60558-623-6.
[12] Intel. AP-684: Understanding the flash translation layer (FTL) specification. Intel Application Note. http://www.jbosn.com/download_documents/FTL_INTEL.pdf, 1998.
[13] Christopher King and Timothy Vidas. Empirical analysis of solid state disk data retention when used with contemporary operating systems. Volume 8, pages S111–S117, Amsterdam, The Netherlands, August 2011. Elsevier Science Publishers B. V.
[14] James Luck and Mark Stokes. An integrated approach to recovering deleted files from NAND flash data. Small Scale Digital Device Forensics Journal, 2(1):1941–6164, 2008.
[15] Sergei P. Skorobogatov. Data remanence in flash memory devices. In Cryptographic Hardware and Embedded Systems - CHES 2005, 7th Intl. Workshop, Edinburgh, UK, August 29 - September 1, 2005, Proc., volume 3659 of Lecture Notes in Computer Science, pages 339–353. Springer, 2005.
[16] Robert Templeman and Apu Kapadia. Gangrene: exploring the mortality of flash memory. In HotSec’12, pages 1–1, Berkeley, CA, USA, 2012. USENIX Association.
[17] Michael Wei, Laura M. Grupp, Frederick E. Spada, and Steven Swanson. Reliably erasing data from flash-based solid state drives. In FAST’11, pages 8–8, Berkeley, CA, USA, 2011. USENIX Association. ISBN 978-1-931971-82-9.
Proceedings of the Twenty-Eighth AAAI Conference on Artificial Intelligence
A Tractable Approach to ABox Abduction over Description Logic Ontologies
Jianfeng Du
Guangdong University of
Foreign Studies,
Guangzhou 510006, China
jfdu@gdufs.edu.cn
Kewen Wang
Griffith University,
Brisbane, QLD 4111, Australia
k.wang@griffith.edu.au
Yi-Dong Shen
State Key Laboratory of
Computer Science,
Institute of Software,
Chinese Academy of Sciences,
Beijing 100190, China
ydshen@ios.ac.cn
Abstract
ABox abduction is an important reasoning mechanism for description logic ontologies. It computes all minimal explanations (sets of ABox assertions) whose appending to a consistent ontology enforces the entailment of an observation while keeping the ontology consistent. We focus on practical computation for a general problem of ABox abduction, called the query abduction problem, where an observation is a Boolean conjunctive query and the explanations may contain fresh individuals neither in the ontology nor in the observation. However, in this problem there can be infinitely many minimal explanations. Hence we first identify a class of TBoxes called first-order rewritable TBoxes. It guarantees the existence of finitely many minimal explanations and is sufficient for many ontology applications. To reduce the number of explanations that need to be computed, we introduce a special kind of minimal explanations called representative explanations from which all minimal explanations can be retrieved. We develop a tractable method (in data complexity) for computing all representative explanations in a consistent ontology. Experimental results demonstrate that the method is efficient and scalable for ontologies with large ABoxes.
Introduction
In artificial intelligence, abductive reasoning (Eiter and Gottlob 1995) is an important logic-based mechanism for computing explanations for an observation that is not entailed by a background theory, where an explanation is a set of facts that should be added to the theory to enforce the entailment. Many ontology applications need to explain why an observation cannot be entailed by an ontology. As a result, abductive reasoning has regained much attention in description logics (DLs), which underpin the standard Web Ontology Language (OWL). A DL ontology consists of a TBox storing intensional information and an ABox storing extensional information. ABox abduction (Klarman, Endriss, and Schlobach 2011; Du et al. 2011a) is an adaptation of abductive reasoning to DLs. It computes all minimal explanations (sets of ABox assertions) whose appending to a given consistent DL ontology enforces the entailment of an observation while keeping the ontology consistent.
Copyright © 2014, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved.
ABox abduction was advocated in (Elsenbroich, Kutz, and Sattler 2006), where some use scenarios such as medical diagnosis were presented. It was emphasized in (Bada, Mungall, and Hunter 2008) as a feature of support tools for ontology quality control. It was also adapted to semantic matchmaking by treating requests as observations and offers as an ABox (Du et al. 2011b). Recently, ABox abduction was applied to explain why a tuple is not an answer to a conjunctive query (Calvanese et al. 2013). As was pointed out in (Borgida, Calvanese, and Rodriguez-Muro 2008), this facility is as important as explaining why a tuple is a query answer in ontology-based data access (OBDA) systems.
Although ABox abduction is important and useful, the problem of computing minimal explanations for ABox abduction is rarely investigated. An initial attempt to compute all minimal explanations was proposed in (Klarman, Endriss, and Schlobach 2011), where a given ontology is expressed in the DL ALC and a given observation is a set of ABox assertions. Since the explanations can be over arbitrary ALE concepts, the number of minimal explanations can be infinite and the computational method proposed in (Klarman, Endriss, and Schlobach 2011) cannot guarantee termination. For example, consider an ontology consisting of a single axiom ∃hasParent.Person ⊑ Person, which says that something having a person as its parent is a person (Du et al. 2011a). Then the observation that Tom is a person (written {Person(Tom)}) can have infinitely many minimal explanations of the form {(∃hasParent.···∃hasParent.Person)(Tom)}. To address the termination problem, Du et al. (2011a) considered ABox abduction from a more practical perspective. They require the explanations to be over a finite set of concept names and role names (called abducible predicates) and to contain only individuals in the given ontology, so that all minimal explanations can be computed in finite time. They also propose a method for computing all such minimal explanations in DLs that can be more expressive than ALC.
Recently, Calvanese et al. (2013) considered the complexity aspects of a general problem of ABox abduction, called the query abduction problem. This problem takes Boolean conjunctive queries (BCQs) as observations and allows fresh individuals neither in the ontology nor in the observation to appear in explanations. For expressing observations, BCQs are more general than sets of ABox assertions, since a set of ABox assertions can be treated as a BCQ. Allowing fresh individuals can bring more intuitive explanations. Take the aforementioned ontology for example. Suppose both hasParent and Person are abducible predicates. According to the definition of ABox abduction proposed in (Du et al. 2011a), the observation Person(Tom) has only one minimal explanation E1 = {hasParent(Tom, Tom), Person(Tom)}. By the definition of the query abduction problem, the same observation has some other minimal explanations that are more intuitive, such as E2 = {hasParent(Tom, u), Person(u)}, where u denotes a fresh individual. E2 is more intuitive than E1 since Tom does not necessarily have a parent who is Tom itself. On the other side, allowing fresh individuals may lead to infinitely many minimal explanations. For the above example, the observation Person(Tom) actually has infinitely many minimal explanations of the form {hasParent(Tom, u1), hasParent(u1, u2), ..., hasParent(un−1, un), Person(un)}, where u1, ..., un are fresh individuals. This makes the issue of computing all minimal explanations for the query abduction problem much tougher than that for the problem of ABox abduction addressed in (Du et al. 2011a).
Towards practical computation for the query abduction problem, we first identify a class of TBoxes called first-order rewritable TBoxes. We show that it guarantees the existence of finitely many minimal explanations and is sufficient for many ontology applications, such as OBDA systems. Moreover, when the TBox is first-order rewritable, the set of minimal explanations for a BCQ can be computed in polynomial time in terms of data complexity, i.e., the complexity measured in the size of the ABox only.
However, the set of minimal explanations for the query abduction problem with a first-order rewritable TBox can still be too large to be computed. Consider an ontology having n axioms ∃r1.⊤ ⊑ A1, ..., ∃rn.⊤ ⊑ An in the TBox and m assertions A1(a1), ..., A1(am) in the ABox. Suppose all role names in the ontology are abducible predicates; then the BCQ {A1(a), ..., An(a)} will have (m+1)^n minimal explanations of the form {r1(a, u1), ..., rn(a, un)}, where ui is an individual in {a1, ..., am} or a fresh individual. To reduce the number of explanations that need to be computed, we propose representative explanations, which are minimal explanations not strictly subsumed by other minimal explanations. A minimal explanation E is said to be strictly subsumed by another one E′ if E′ can become a subset of E by replacing fresh individuals with existing or fresh individuals, but E cannot become a subset of E′ in this way. The number of representative explanations can be much smaller than that of minimal explanations. For the above example, the BCQ {A1(a), ..., An(a)} has only one representative explanation {r1(a, u1), ..., rn(a, un)}, where u1, ..., un are different fresh individuals. Moreover, we show that the set of minimal explanations can be retrieved from the set of representative explanations.
We propose a tractable method (in data complexity) for computing all representative explanations in a consistent ontology whose TBox is first-order rewritable. It does not need to compute all minimal explanations beforehand. We compare this method with the state-of-the-art method for computing all minimal explanations (Du et al. 2011a). Experimental results show that, when both methods compute the same set of explanations, the proposed method is much more efficient and more scalable; moreover, when role names are used as abducible predicates, the computation of all minimal explanations is often impractical, but the set of representative explanations can still be efficiently computed.
Due to space limitations, proofs are only provided in our technical report (Du, Wang, and Shen 2014).
Preliminaries
We assume that the reader is familiar with DLs (Baader et al. 2003). We only recall that a DL ontology consists of a TBox and an ABox, where the TBox contains axioms declaring the relations between concepts and roles, such as concept inclusion axioms C ⊑ D and role inclusion axioms r ⊑ s, and the ABox contains assertions declaring the membership relations between individuals and concepts or roles as well as (in)equivalence relations among individuals.
We assume that the Unique Name Assumption (Baader et al. 2003) is adopted and only consider ABoxes consisting of basic assertions, namely concept assertions of the form A(a) and role assertions of the form r(a, b), where A is a concept name, r is a role name, and a and b are individuals. Other concept assertions and role assertions can be normalized to basic ones in a standard way. Let Σ be a set of concept names and role names. An ABox that contains only concept names and role names from Σ is called a Σ-ABox.
We use the traditional semantics for DLs given e.g. in (Baader et al. 2003). A DL ontology O is said to be consistent, denoted by O ⊭ ⊥, if it has at least one model; otherwise, it is inconsistent, denoted by O ⊨ ⊥.
A Boolean conjunctive query (BCQ) is of the form ∃x Φ(x, c), where Φ(x, c) is a conjunction of atoms over concept names and role names, x are variables and c are individuals. A BCQ can also be treated as a set of atoms. For example, the BCQ ∃x A(a) ∧ r(a, x) is written as {A(a), r(a, x)}. A substitution for a BCQ Q is a mapping from variables in Q to individuals or variables; it is called ground if it maps variables in Q to individuals only. A BCQ Q is said to be entailed by a DL ontology O if Q is satisfied by all models of O, written O ⊨ Q.
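The set-of-atoms view of a BCQ and the action of a substitution can be encoded in a few lines. The following Python sketch is our own encoding (not from the paper); variables are marked by a leading `?`:

```python
# Minimal sketch (our own encoding): a BCQ as a frozenset of atoms
# ("predicate", arg1, ...), with variables marked by a leading "?";
# a substitution is a dict over variables.

def apply_subst(query, subst):
    """Apply a substitution to every atom of a BCQ."""
    return frozenset((a[0],) + tuple(subst.get(t, t) for t in a[1:])
                     for a in query)

def is_ground(query):
    """A BCQ is ground if no variable remains in it."""
    return all(not t.startswith("?") for a in query for t in a[1:])

# The example BCQ  ∃x. A(a) ∧ r(a, x),  written {A(a), r(a, x)}:
q = frozenset({("A", "a"), ("r", "a", "?x")})
g = apply_subst(q, {"?x": "b"})   # a ground substitution mapping x to b
assert g == frozenset({("A", "a"), ("r", "a", "b")}) and is_ground(g)
```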
Some DLs can be translated to Datalog± (Calì, Gottlob, and Lukasiewicz 2012). A Datalog± ontology consists of finitely many tuple generating dependencies (TGDs) ∀x∀y Φ(x, y) → ∃z ϕ(x, z), negative constraints (simply constraints) ∀x Φ(x) → ⊥, as well as equality generating dependencies (EGDs) ∀x Φ(x) → x1 = x2, where Φ(x, y), ϕ(x, z) and Φ(x) are conjunctions of atoms, x1 and x2 occur in x, and ⊥ denotes the truth constant false. The portions of a TBox T that are translated to TGDs, constraints and EGDs are denoted by TD, TC and TE, respectively. We introduce the notion of a first-order rewritable TBox below.
Definition 1. A TBox T is said to be first-order rewritable if it can be translated to a Datalog± ontology and satisfies the following three conditions for an arbitrary BCQ Q and an arbitrary Σ-ABox A, where Σ is the set of concept names and role names in T:
(1) T ∪ A ⊨ Q if and only if TD ∪ A ⊨ Q or T ∪ A ⊨ ⊥;
(2) TC ∪ TE can be rewritten (according to TD) to a finite set of BCQs, denoted by γ(TC ∪ TE, TD), such that T ∪ A ⊨ ⊥ if and only if A ⊨ Q′ for some Q′ ∈ γ(TC ∪ TE, TD);
(3) Q can be rewritten (according to TD) to a finite set of BCQs, denoted by τ(Q, TD), such that TD ∪ A ⊨ Q if and only if A ⊨ Q′ for some Q′ ∈ τ(Q, TD).
A TBox will be first-order rewritable if it can be translated to a Datalog± ontology that consists of linear (multi-linear, sticky, or sticky-join) TGDs, constraints and special EGDs that are separable from TGDs (Calì, Gottlob, and Lukasiewicz 2012; Calì, Gottlob, and Pieris 2012). For many DLs in the DL-Lite family (Calvanese et al. 2007), such translations exist and have been given in (Calì, Gottlob, and Lukasiewicz 2012; Calì, Gottlob, and Pieris 2012). Since the DL-Lite family has become popular in many ontology applications, such as OBDA systems, first-order rewritable TBoxes are sufficient for these applications. Therefore, we focus on first-order rewritable TBoxes.
The Query Abduction Problem
We consider a general problem of ABox abduction, called the query abduction problem, which is derived from (Calvanese et al. 2013) and is defined below.
Definition 2. Let O = T ∪ A be a consistent DL ontology, Q be a BCQ and Σ be a set of concept names and role names (called abducible predicates). We call P = (T, A, Q, Σ) an instance of the query abduction problem. An explanation for P is a Σ-ABox E such that T ∪ A ∪ E ⊨ Q and T ∪ A ∪ E ⊭ ⊥. The set of explanations for P is denoted by expl(P).
A traditional task for ABox abduction (Klarman, Endriss, and Schlobach 2011; Du et al. 2011a) is to compute all explanations with certain minimality. Since an explanation defined above may contain fresh individuals not in P (i.e., neither in O nor in Q), we cannot use standard set-inclusion minimality as in (Du et al. 2011a). To compare explanations containing different fresh individuals, we treat fresh individuals as variables. Analogously, we define a substitution for an explanation E as a mapping from fresh individuals in E to existing or fresh individuals, and a renaming for E as a substitution for E that maps different fresh individuals to different fresh individuals. Below we introduce a variant set-inclusion relation and the notion of minimal explanation.
Definition 3. For two explanations E and E′ for P = (T, A, Q, Σ), by E′ ⊆r E we denote that there exists a renaming ρ for E′ such that E′ρ ⊆ E. A minimal explanation E for P is an explanation for P such that there is no explanation E′ for P fulfilling E′ ⊂r E, where E′ ⊂r E means that E′ ⊆r E but not E ⊆r E′. The set of different minimal explanations for P up to renaming of fresh individuals (simply up to renaming) is denoted by mexpl(P).
As discussed in the first section, mexpl(P) can contain infinitely many explanations for P. However, when T is restricted to be first-order rewritable, mexpl(P) becomes a finite set. This conclusion is drawn from Lemma 1, where a bipartition of a BCQ Q is a tuple of two BCQs (Q1, Q2) such that Q1 ∩ Q2 = ∅ and Q1 ∪ Q2 = Q.
Lemma 1. For every minimal explanation E for P = (T, A, Q, Σ), there exists a BCQ Q′ ∈ τ(Q, TD), a bipartition (Q1, Q2) of Q′, a ground substitution θ for Q2 and a ground substitution σ for Q1θ such that E = Q1θσ, Q2θ ⊆ A, and T ∪ A ∪ Q1θσ ⊭ ⊥.
Let 𝓔T,A,Σ(Q) denote the set {Q1θ | Q′ ∈ τ(Q, TD), (Q1, Q2) is a bipartition of Q′, and θ is a ground substitution for Q2 such that Q1 contains only predicates in Σ, Q2θ ⊆ A, and T ∪ A ∪ Q1θ ⊭ ⊥}. Lemma 1 shows that mexpl(P) is a subset of {Eσ | E ∈ 𝓔T,A,Σ(Q), σ is a ground substitution for E} up to renaming. Since the number of BCQs in τ(Q, TD), the number of bipartitions of a BCQ in τ(Q, TD) and the number of different explanations of the form Eσ for an element E ∈ 𝓔T,A,Σ(Q) are finite, while τ(Q, TD) is independent from A, the cardinality of mexpl(P) is finite and is only polynomial in the cardinality of A under the assumption that the size of T and the size of Q are constants. Moreover, by Condition (2) in Definition 1, T ∪ A ∪ Q1θ ⊭ ⊥ if and only if there is no Q′ ∈ γ(TC ∪ TE, TD) such that A ∪ Q1θ ⊨ Q′. Hence the consistency check for T ∪ A ∪ Q1θ can be done in polynomial time in data complexity, and so can mexpl(P) be computed. However, mexpl(P) can still be too large to be computed, as shown in the first section. To reduce the number of explanations that need to be computed, below we introduce a subsumption relation among explanations and the notion of representative explanation.
Definition 4. An explanation E for P is said to be subsumed by another explanation E′ for P, denoted by E′ ⊆s E, if there is a substitution θ for E′ such that E′θ ⊆ E; it is said to be strictly subsumed by E′, denoted by E′ ⊂s E, if E′ ⊆s E and not E ⊆s E′. A representative explanation for P is a minimal explanation E for P such that there is no minimal explanation E′ for P fulfilling E′ ⊂s E. The set of different representative explanations for P up to renaming is denoted by rexpl(P).
An example of minimal explanations and representative explanations is given below.
Example 1. Let O = T ∪ A be a DL ontology. The TBox T
consists of the following three axioms:
α1 : Student ⊑ Person α2 : Student ⊑ ¬Employee
α3 : ∃worksFor.⊤ ⊑ Employee
The ABox A consists of the following two assertions:
α4 : Person(Tom) α5 : Student(Amy)
Suppose the set Σ of abducible predicates is {Student, Employee, worksFor}. For the BCQ Q1 = {Person(Tom), worksFor(Tom, x)}, there are three minimal explanations for (T, A, Q1, Σ), namely {worksFor(Tom, Tom)}, {worksFor(Tom, Amy)} and {worksFor(Tom, u)}, where u denotes a fresh individual; among them the last one is the unique representative explanation for (T, A, Q1, Σ). For the BCQ Q2 = {Person(Amy), worksFor(Amy, x)}, there is no minimal explanation for (T, A, Q2, Σ), because a minimal explanation would have to contain an assertion of the form worksFor(Amy, a), which is inconsistent with α2, α3 and α5.
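Example 1's claim for Q1 can be checked by brute force. The sketch below is our own encoding (the function `s_incl` and the leading-underscore convention for fresh individuals are our assumptions): it tests, for each of the three minimal explanations, whether another one strictly subsumes it in the sense of Definition 4.

```python
from itertools import product

# Brute-force check of Example 1's claim (our own encoding, not the
# paper's implementation). Fresh individuals carry a leading "_";
# atoms are tuples ("predicate", arg1, ...).

def fresh(e):
    return sorted({t for a in e for t in a[1:] if t.startswith("_")})

def s_incl(x, y):
    """Some substitution of x's fresh individuals (to any term
    occurring in y) maps x to a subset of y."""
    f = fresh(x)
    pool = sorted({t for a in y for t in a[1:]})
    return any({(a[0],) + tuple(dict(zip(f, tgt)).get(t, t) for t in a[1:])
                for a in x} <= y
               for tgt in product(pool, repeat=len(f)))

mexpl = [frozenset({("worksFor", "Tom", "Tom")}),
         frozenset({("worksFor", "Tom", "Amy")}),
         frozenset({("worksFor", "Tom", "_u")})]

# Representative = not strictly subsumed by another minimal explanation.
rexpl = [e for e in mexpl
         if not any(s_incl(o, e) and not s_incl(e, o)
                    for o in mexpl if o != e)]
assert rexpl == [frozenset({("worksFor", "Tom", "_u")})]
```

Only the explanation with the fresh individual survives, because substituting u by Tom or Amy maps it into each ground explanation, while no substitution maps a ground explanation into it.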
We propose to compute rexpl(P) instead of mexpl(P) because the cardinality of rexpl(P) can be much smaller than that of mexpl(P), and as shown in Theorem 1, mexpl(P) can be retrieved from rexpl(P) by substituting fresh individuals, performing consistency checks and ⊂r-checks, and by deleting duplicate explanations up to renaming.
Theorem 1. For a set S of explanations for P, let reducer(S) denote the set of explanations obtained from {E ∈ S | there is no E′ ∈ S such that E′ ⊂r E} by deleting all duplicate explanations up to renaming. We have mexpl(P) = reducer({Eθ | E ∈ rexpl(P), θ is a substitution for E such that T ∪ A ∪ Eθ ⊭ ⊥}) up to renaming.
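On Example 1's query Q1, this retrieval of minimal explanations from representative ones can be sketched as follows (our own encoding; the consistency filter and the final reducer pass are omitted because, in this example, every substitution instance is consistent and no instance ⊂r-includes another):

```python
from itertools import product

# Sketch of Theorem 1 on Example 1's query Q1 (our own encoding).
# Fresh individuals carry a leading "_".

rexpl = [frozenset({("worksFor", "Tom", "_u")})]   # the representative
existing = ["Tom", "Amy"]                          # individuals of O

def substitutions(expl):
    """All substitutions of expl's fresh individuals into existing
    individuals or (kept as) fresh ones."""
    fresh = sorted({t for a in expl for t in a[1:] if t.startswith("_")})
    for tgt in product(existing + fresh, repeat=len(fresh)):
        yield dict(zip(fresh, tgt))

candidates = {frozenset((a[0],) + tuple(s.get(t, t) for t in a[1:])
                        for a in e)
              for e in rexpl for s in substitutions(e)}

# Exactly the three minimal explanations of Example 1:
assert candidates == {frozenset({("worksFor", "Tom", "Tom")}),
                      frozenset({("worksFor", "Tom", "Amy")}),
                      frozenset({("worksFor", "Tom", "_u")})}
```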
Computing all Representative Explanations
Given P = (T, A, Q, Σ), the explanations for P can be computed from the aforementioned set 𝓔T,A,Σ(Q) = {Q1θ | Q′ ∈ τ(Q, TD), (Q1, Q2) is a bipartition of Q′, and θ is a ground substitution for Q2 such that Q1 contains only predicates in Σ, Q2θ ⊆ A, and T ∪ A ∪ Q1θ ⊭ ⊥} by applying ground substitutions, as shown in the following lemma.
Lemma 2. Let E be an element in 𝓔T,A,Σ(Q) and σ a ground substitution for E. Then Eσ is an explanation for P if and only if T ∪ A ∪ Eσ ⊭ ⊥.
However, to efficiently compute all representative explanations for P, it is unwise to enumerate all ground substitutions for elements in 𝓔T,A,Σ(Q). Actually, we need only consider a small portion of the ground substitutions for elements in 𝓔T,A,Σ(Q). We call a ground substitution σ for a BCQ Q a fresh substitution for Q in P if it only maps variables in Q to fresh individuals not in P.
By ΓT,A,Σ(Q) we denote the set obtained from {Eσ | E ∈ 𝓔T,A,Σ(Q), σ is a fresh substitution for E in P such that T ∪ A ∪ Eσ ⊭ ⊥} by deleting all duplicate elements up to renaming. By Lemma 2, all elements in ΓT,A,Σ(Q) are explanations for P. The following lemma shows that all representative explanations for P can be found in ΓT,A,Σ(Q).
Lemma 3. For an arbitrary E ∈ rexpl(P), there exists a renaming ρ for E such that Eρ ∈ ΓT,A,Σ(Q).
The set ΓT,A,Σ(Q) may contain explanations for P that are not representative. The following two lemmas show that the set of representative explanations for P can be obtained from ΓT,A,Σ(Q) by dropping non-minimal explanations and non-representative explanations in turn, where the ⊂r-checks and the ⊂s-checks are applied only to explanations in ΓT,A,Σ(Q).
Lemma 4. For any E ∈ ΓT,A,Σ(Q), E is a minimal explanation for P if and only if there is no E′ ∈ ΓT,A,Σ(Q) such that E′ ⊂r E.
Lemma 5. For any E ∈ reducer(ΓT,A,Σ(Q)), where reducer(S) is defined in Theorem 1, E is a representative explanation for P if and only if there is no E′ ∈ reducer(ΓT,A,Σ(Q)) such that E′ ⊂s E.
By Lemmas 3, 4 and 5, we obtain a direct method for computing rexpl(P) without computing mexpl(P) beforehand. The method, together with its soundness and completeness, is formally stated in the following theorem.
Theorem 2. For a set S of explanations for P, let reduce_s(S) denote the set of explanations obtained from { E ∈ S | there is no E′ ∈ S such that E′ ≺s E } by deleting all duplicate explanations up to renaming. We have rexpl(P) = reduce_s(reduce_r(ΓT,A,Σ(Q))) up to renaming.
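The two reduction steps can be sketched generically; in this sketch the paper's strict orderings ≺r and ≺s are injected as callbacks, and proper set inclusion is used only as a toy stand-in for them (all names are ours):

```python
def reduce_by(explanations, precedes):
    """Keep each explanation E for which no other E' satisfies precedes(E', E).
    `precedes` stands in for the paper's strict orderings (the r- or s-check)."""
    return [E for E in explanations
            if not any(precedes(Ep, E) for Ep in explanations if Ep is not E)]

def representative_explanations(gamma, precedes_r, precedes_s):
    minimal = reduce_by(gamma, precedes_r)   # reduce_r: drop non-minimal explanations
    return reduce_by(minimal, precedes_s)    # reduce_s: drop non-representative ones

# Toy run with proper set inclusion as a stand-in ordering.
gamma = [frozenset({"a"}), frozenset({"a", "b"}), frozenset({"c"})]
proper_subset = lambda e1, e2: e1 < e2
rexpl = representative_explanations(gamma, proper_subset, proper_subset)
```

In the toy run, the two-element set is dropped by the first reduction because a strictly smaller explanation precedes it.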
It should be mentioned that the intermediate step, namely dropping non-minimal explanations from ΓT,A,Σ(Q), is crucial in guaranteeing the soundness of the above method. We show this by the following example.
Example 2. Let O = (T, A) be a DL ontology, where the TBox T consists of the following two axioms α1 and α2, and the ABox A has only the following assertion α3.
α1 : Student ⊑ ∃hasJob.Parttime
α2 : ∃hasJob.⊤ ⊑ Worker
α3 : Student(Tom)
Suppose the set Σ of abducible predicates is {hasJob}. Consider the problem of computing all representative explanations for P = (T, A, Q, Σ), where Q = {Parttime(Tom), Worker(Tom)}. It is clear that TD = T and TC = TE = ∅. We assume that τ(Q, TD) = {Q′} where Q′ = {Student(Tom), hasJob(Tom, x), hasJob(Tom, y)}. This assumption is reasonable since for an arbitrary ABox A′, we have TD ∪ A′ ⊨ Q if and only if A′ ⊨ Q′. Then ET,A,Σ(Q) has a single element {hasJob(Tom, x), hasJob(Tom, y)}. By applying fresh substitutions for this element, we obtain two explanations E = {hasJob(Tom, u)} and E′ = {hasJob(Tom, u1), hasJob(Tom, u2)} in ΓT,A,Σ(Q). It can be seen that E ⪯s E′, E′ ⪯s E, and E ≺r E′. Hence, if we do not drop non-minimal explanations from ΓT,A,Σ(Q), we cannot prune E′, which is not a minimal explanation for P.
Consider the time complexity of the above method. As analyzed in the previous section, ET,A,Σ(Q) has polynomially many elements and can be computed in polynomial time in terms of data complexity. By Condition (2) in Definition 1, the consistency checks performed in the course of computing ΓT,A,Σ(Q) from ET,A,Σ(Q) can be done in polynomial time too. Hence ΓT,A,Σ(Q) can be computed in polynomial time in terms of data complexity, and so can reduce_s(reduce_r(ΓT,A,Σ(Q))).
The above method can be efficiently implemented by a level-wise strategy, where the level number k runs from 0 to the maximum cardinality of BCQs in τ(Q, TD), and in the k-th level only elements in ET,A,Σ(Q) whose cardinality equals k are considered. This strategy can often be highly effective in pruning explanations that are not minimal or not representative. An example illustrating the above method with the level-wise strategy is given below.
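A minimal sketch of the level-wise strategy follows; the expand, precedes, and consistent callbacks abstract the paper's fresh substitutions, ≺-checks, and consistency test, and all names are our own assumptions:

```python
def level_wise(candidates, max_level, expand, precedes, consistent):
    """Handle candidate element sets level by level (k = cardinality), so that
    small explanations found early can prune larger candidates before their
    comparatively expensive consistency check is ever performed."""
    accepted = []
    for k in range(max_level + 1):
        for cand in (c for c in candidates if len(c) == k):
            for expl in expand(cand):   # e.g. apply fresh substitutions
                if any(precedes(old, expl) for old in accepted):
                    continue            # pruned without a consistency check
                if consistent(expl):
                    accepted.append(expl)
    return accepted

# Toy run: the one-atom candidate is accepted in Level 1 and then prunes
# the two-atom candidate in Level 2 (proper inclusion as toy ordering).
cands = [frozenset({"worksFor(Tom,x)"}),
         frozenset({"Student(Tom)", "worksFor(Tom,x)"})]
result = level_wise(cands, 2, lambda c: [c],
                    lambda a, b: a < b, lambda e: True)
```

The toy run mirrors the pruning pattern of Example 3: only the smaller explanation survives.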
Example 3. Consider again O and P = (T, A, Q1, Σ) as given in Example 1. It is not hard to see that T is a first-order rewritable TBox such that TD = {α1, α2}, TC = {α3}, TE = ∅, γ(TC ∪ TE, TD) = {Q′} and τ(Q1, TD) = {Q1, Q2}, where Q′ = {Student(x), Employee(x)}, Q1 = {Person(Tom), worksFor(Tom, x)} and Q2 = {Student(Tom), worksFor(Tom, x)}. We have ET,A,Σ(Q1) = { {worksFor(Tom, x)}, {Student(Tom), worksFor(Tom, x)} }. Let S be a set storing representative explanations for P; initially, S is empty. We handle every element in ET,A,Σ(Q1) in a level-wise manner, where the level number runs from 0 to 2. In Level 0, there is no element to be handled. In Level 1, we handle
Table 1: The characteristics of test ontologies

Ontology    #C   #R  #TA  #AA                   #I
Semintec    60   16  203  65,240                17,941
Vicodi      194  12  223  116,181               33,238
LUBM1-100   43   32  88   100,543-13,824,437    17,174-2,179,766

Note: #C/#R/#TA/#AA/#I is the number of concept names/role names/TBox axioms/ABox assertions/individuals; for LUBM1-100, #AA and #I range from the LUBM1 value to the LUBM100 value.
{worksFor(Tom, x)}. By applying fresh substitutions for it, we get E1 = {worksFor(Tom, u)}, where u denotes a fresh individual. Since A ∪ E1 ⊭ Q′, we have T ∪ A ∪ E1 ⊭ ⊥ and thus E1 can be in reduce_s(reduce_r(ΓT,A,Σ(Q))). We add E1 to S. In Level 2, we handle {Student(Tom), worksFor(Tom, x)}. By applying fresh substitutions for it, we get E2 = {Student(Tom), worksFor(Tom, u)}. Since E1 ≺r E2, E2 cannot be in reduce_s(reduce_r(ΓT,A,Σ(Q))). That is, E2 is pruned according to E1 without checking the consistency of T ∪ A ∪ E2. Finally, we obtain rexpl(P) = S = {E1}.
Experimental Evaluation
The proposed method was implemented in Java, using the Requiem (Pérez-Urbina, Motik, and Horrocks 2010) API for query rewriting and the MySQL engine to store and access ABoxes. Seven benchmark ontologies with large ABoxes were used. The first two are Semintec (about financial services) and Vicodi (about European history). The remaining ontologies are LUBMn (n = 1, 5, 10, 50, 100) from the Lehigh University Benchmark (Guo, Pan, and Heflin 2005), where n is the number of universities. These ontologies have TBoxes that are almost first-order rewritable and have been used to compare different DL reasoners (Motik and Sattler 2006) and to verify methods for ABox abduction (Du et al. 2011a). We removed some TBox axioms that Requiem cannot handle from the above ontologies, making the remaining portions of the TBoxes first-order rewritable. The characteristics of all test ontologies are reported in Table 1. All experiments were conducted on a laptop with an Intel Dual-Core 2.20GHz CPU and 4GB RAM, running Windows 7, where the maximum Java heap size was set to 1GB.
The experiments are divided into two parts.
In the first part, we compared our proposed method with the Prolog-based method1 proposed in (Du et al. 2011a) on atomic BCQs that consist of single assertions. The Prolog-based method is the state-of-the-art method for computing all minimal explanations. Basically, it transforms a DL ontology to a Prolog program and applies a Prolog engine to compute explanations. For this part, we randomly generated thirty concept assertions as observations for each test ontology. In particular, for all LUBMn ontologies we used the same set of observations so as to verify the scalability against the increasing number n of universities. Neither any generated observation nor its negation is entailed by the test ontology. Both methods work in two phases, namely
1http://dataminingcenter.net/abduction/
the preprocessing phase and the query phase. In the preprocessing phase, the proposed method loads the TBox T and computes γ(TC ∪ TE, TD), while the Prolog-based method transforms the ontology to a Prolog program and loads it into the Prolog engine. In the query phase, both methods handle all observations one by one.
The proposed method always finishes the preprocessing phase within one second for all test ontologies. In contrast, the Prolog-based method spends from twenty seconds to several hours in the preprocessing phase; it even runs out of memory when loading the transformed Prolog programs into the Prolog engine for LUBM10-100.
We focused more on the query phase and set a one-hour time limit for both methods to handle one observation. We first set all concept names but no role names as abducible predicates. The comparison results on the query phase are reported in Table 2. For all cases where the Prolog-based method does not run out of memory, the set of representative explanations computed by the proposed method is the same as the set of minimal explanations computed by the Prolog-based method. Moreover, the proposed method is much more efficient and more scalable against the increasing number n of universities for LUBMn ontologies. The Prolog-based method is relatively inefficient because it cannot guarantee to compute a minimal explanation in polynomial time in data complexity. We then set all concept names and all role names as abducible predicates. The superiority of the proposed method becomes even more evident. The Prolog-based method handles only 27 observations for Vicodi, no observations for Semintec, and three observations for LUBM1 and LUBM5 within the one-hour limit per observation. It performs so poorly because there are often many minimal explanations for an observation when role names are used as abducible predicates. In contrast, the proposed method handles every observation in 1.5 seconds (most in milliseconds), except for one observation over the concept name Student in each LUBMn ontology. It spends much time on the Student case because the number of representative explanations is large: the proposed method computes 55,491 representative explanations in 15 minutes on LUBM1 and exceeds one hour on LUBM5-100.
In the second part of the experiments, we verified how well the proposed method works for general BCQs that may contain existentially quantified variables. We cannot compare it with the Prolog-based method here since the Prolog-based method does not fully support general BCQs. We randomly generated twenty BCQs as observations for each of the 14 benchmark conjunctive queries (CQs) that come with LUBM (Guo, Pan, and Heflin 2005). Each BCQ is not entailed by any LUBMn ontology and was generated from a benchmark CQ by replacing the first variable with an individual and keeping the other variables as existentially quantified variables. We tested these BCQs on LUBMn ontologies with all concept names and all role names set as abducible predicates. Let Gn denote the group of BCQs generated from the nth benchmark CQ. Figure 1 shows the average execution time for handling a BCQ in each group against the increasing number n of universities. In the figure, part (a) shows the results for the relatively easy groups, in which BCQs can
Table 2: The comparison results when only concept names are used as abducible predicates

                The Proposed Method         Prolog-based Method
Ontology    avg.t  max.t  avg.#re      avg.t   max.t   avg.#me
Semintec    21     72     4.0          1,808   2,650   4.0
Vicodi      9      34     6.3          5,211   21,580  6.3
LUBM1       44     403    2.4          901     5,570   2.4
LUBM5       43     375    2.4          8,412   52,438  2.4
LUBM10      44     371    2.4          running out of memory
LUBM50      44     388    2.4          running out of memory
LUBM100     45     382    2.4          running out of memory

Note: avg.t/max.t is the average/maximum execution time (in milliseconds) for handling an observation; avg.#me/avg.#re is the average number of minimal explanations/representative explanations for an observation.
Figure 1: The results for handling general BCQs in LUBMn
be handled within four seconds for any LUBMn ontology, while part (b) shows the results for the other groups, where execution times exceeding one hour are not shown. Nine of the 14 groups are relatively easy; in particular, a BCQ in G3, G11 or G14 can be handled in a few milliseconds on average. All BCQs in G7 and in all relatively hard groups except G2 contain atoms over Student. These atoms have been shown to cause many representative explanations in the first part of the experiments. The BCQs in G2 also have a large number of representative explanations: the average number of representative explanations for a BCQ in G2 ranges from 1,007 (on LUBM1) to 67,440 (on LUBM50). These results show that the proposed method can also efficiently compute representative explanations for general BCQs.
Related Work
Abductive reasoning is a promising new research area in the context of DLs (Elsenbroich, Kutz, and Sattler 2006), where three kinds of abductive problems have been studied. Besides ABox abduction, the other two kinds are concept abduction and TBox abduction. Concept abduction computes all concepts that can be added as conjunctive components to a given satisfiable concept so that the concept becomes subsumed by an observation (which is also a concept) while remaining satisfiable. Some tableau-based methods for concept abduction have been proposed in (Noia, Sciascio, and Donini 2007; 2009), while the complexity of concept abduction has been
studied in (Bienvenu 2008). TBox abduction computes all sets of abducible axioms that can be appended to a TBox to enforce the entailment of an observation, which is a concept inclusion axiom. An automata-based method for TBox abduction has been proposed in (Hubauer, Lamparter, and Pirker 2010). The methods for concept abduction or TBox abduction are not specifically designed for ABox abduction. There is no empirical evidence that these methods can be practically adapted to ABox abduction or that they can handle a large number of ABox assertions.
As mentioned in the first section, two methods for ABox abduction have been proposed in (Klarman, Endriss, and Schlobach 2011) and in (Du et al. 2011a), respectively. The latter has also been extended to handle the cases where arbitrary concepts are used as abducible predicates (Du et al. 2012). The query abduction problem, which generalizes ABox abduction, is formally defined in (Calvanese et al. 2013), where the complexity for DL-Lite (a widely used DL in the DL-Lite family) is systematically studied but no method for computing all (minimal) explanations is given. As shown in (Du et al. 2011b), the method proposed in (Du et al. 2011a) can be adapted to the query abduction problem for computing some minimal explanations for a BCQ. However, it does not work in polynomial time in data complexity even for first-order rewritable TBoxes. Moreover, as shown by our experimental results, it is significantly less efficient and less scalable than the method proposed here.
Conclusion and Future Work
ABox abduction is an important reasoning mechanism for DL ontologies, but there is still a lack of efficient computational methods, especially for the query abduction problem, which generalizes ABox abduction. In this paper we have addressed the computational aspect of this problem and made the following contributions. First, considering that the number of minimal explanations for a BCQ can be infinite, we identified first-order rewritable TBoxes that guarantee the existence of finitely many minimal explanations. Second, in order to reduce the number of explanations that need to be computed, we introduced representative explanations, from which minimal explanations can be retrieved. Third, we proposed a tractable method (in data complexity) for computing all representative explanations in a consistent DL ontology whose TBox is first-order rewritable. Last but not least, we empirically showed that computing the set of representative explanations is much more practical than computing the set of minimal explanations; moreover, the proposed method is efficient and scalable for DL ontologies with large ABoxes.
There are at least three directions that can be explored in future work. First, the set of representative explanations can still be large in some cases, which calls for specific treatments of these hard cases. Second, it would be useful to develop methods for checking whether an arbitrary TBox is first-order rewritable. Last, besides the class of first-order rewritable TBoxes, it is important to identify other classes of TBoxes, or classes of BCQs, that admit finitely many minimal explanations for the query abduction problem.
Acknowledgements
This work is partly supported by NSFC grants (61375056, 61005043 and 61379043), the Business Intelligence Key Team of Guangdong University of Foreign Studies (TD1202), Guangdong Natural Science Foundation (S2013010012928), the China National 973 project (2014CB340301), as well as ARC grants (DP110101042 and DP130102302).
References
Baader, F.; Calvanese, D.; McGuinness, D. L.; Nardi, D.; and Patel-Schneider, P. F., eds. 2003. The Description Logic Handbook: Theory, Implementation, and Applications. Cambridge University Press.
Bada, M.; Mungall, C.; and Hunter, L. 2008. A call for an abductive reasoning feature in OWL-reasoning tools toward ontology quality control. In Proceedings of the 5th OWLED Workshop on OWL: Experiences and Directions.
Bienvenu, M. 2008. Complexity of abduction in the EL family of lightweight description logics. In Proceedings of the 11th International Conference on Principles of Knowledge Representation and Reasoning (KR).
Borgida, A.; Calvanese, D.; and Rodriguez-Muro, M. 2008. Explanation in the DL-Lite family of description logics. In Proceedings of the 7th International Conference on Ontologies, DataBases, and Applications of Semantics (ODBASE), 1440–1457.
Calì, A.; Gottlob, G.; and Lukasiewicz, T. 2012. A general datalog-based framework for tractable query answering over ontologies. Journal of Web Semantics 14:57–83.
Calì, A.; Gottlob, G.; and Pieris, A. 2012. Towards more expressive ontology languages: The query answering problem. Artificial Intelligence 193:87–128.
Calvanese, D.; De Giacomo, G.; Lembo, D.; Lenzerini, M.; and Rosati, R. 2007. Tractable reasoning and efficient query answering in description logics: The DL-Lite family. Journal of Automated Reasoning 39(3):385–429.
Calvanese, D.; Ortiz, M.; Simkus, M.; and Stefanoni, G. 2013. Reasoning about explanations for negative query answers in DL-Lite. Journal of Artificial Intelligence Research 48:635–669.
Du, J.; Qi, G.; Shen, Y.; and Pan, J. Z. 2011a. Towards practical ABox abduction in large OWL DL ontologies. In Proceedings of the 25th National Conference on Artificial Intelligence (AAAI), 1160–1165.
Du, J.; Wang, S.; Qi, G.; Pan, J. Z.; and Hu, Y. 2011b. A new matchmaking approach based on abductive conjunctive query answering. In Proceedings of the 1st Joint International Semantic Technology Conference (JIST), 144–159.
Du, J.; Qi, G.; Shen, Y.; and Pan, J. Z. 2012. Towards practical ABox abduction in large description logic ontologies. International Journal on Semantic Web and Information Systems 8(2):1–33.
Du, J.; Wang, K.; and Shen, Y. 2014. A tractable approach to ABox abduction over description logic ontologies. Technical report, Guangdong University of Foreign Studies. http://www.dataminingcenter.net/abduction/AAAI14-TR.pdf.
Eiter, T., and Gottlob, G. 1995. The complexity of logic-based abduction. Journal of the ACM 42(1):3–42.
Elsenbroich, C.; Kutz, O.; and Sattler, U. 2006. A case for abductive reasoning over ontologies. In Proceedings of the 3rd OWLED Workshop on OWL: Experiences and Directions.
Guo, Y.; Pan, Z.; and Heflin, J. 2005. LUBM: A benchmark for OWL knowledge base systems. Journal of Web Semantics 3(2–3):158–182.
Hubauer, T.; Lamparter, S.; and Pirker, M. 2010. Automata-based abduction for tractable diagnosis. In Proceedings of the 23rd International Workshop on Description Logics.
Klarman, S.; Endriss, U.; and Schlobach, S. 2011. ABox abduction in the description logic ALC. Journal of Automated Reasoning 46(1):43–80.
Motik, B., and Sattler, U. 2006. A comparison of reasoning techniques for querying large description logic ABoxes. In Proceedings of the 13th International Conference on Logic for Programming, Artificial Intelligence, and Reasoning (LPAR), 227–241.
Noia, T. D.; Sciascio, E. D.; and Donini, F. M. 2007. Semantic matchmaking as non-monotonic reasoning: A description logic approach. Journal of Artificial Intelligence Research 29:269–307.
Noia, T. D.; Sciascio, E. D.; and Donini, F. M. 2009. A tableaux-based calculus for abduction in expressive description logics: Preliminary results. In Proceedings of the 22nd International Workshop on Description Logics.
Pérez-Urbina, H.; Motik, B.; and Horrocks, I. 2010. Tractable query answering and rewriting under description logic constraints. Journal of Applied Logic 8(2):186–209.
In Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2006)
A System for Robotic Heart Surgery that Learns to
Tie Knots Using Recurrent Neural Networks
Hermann Mayer¹, Faustino Gomez², Daan Wierstra², Istvan Nagy¹, Alois Knoll¹, and Jürgen Schmidhuber¹,²
1Department of Embedded Systems and Robotics, Technical University Munich, D-85748 Garching, Germany
{mayerh|nagy|knoll|schmidhu}@in.tum.de
2Istituto Dalle Molle di Studi sull’Intelligenza Artificiale (IDSIA), CH-6928 Manno-Lugano, Switzerland
{tino|daan|juergen}@idsia.ch
Abstract— Tying suture knots is a time-consuming task performed frequently during Minimally Invasive Surgery (MIS). Automating this task could greatly reduce total surgery time for patients. Current solutions to this problem replay manually programmed trajectories, but a more general and robust approach is to use supervised machine learning to smooth surgeon-given training trajectories and generalize from them. Since knot-tying generally requires a controller with internal memory to distinguish between identical inputs that require different actions at different points along a trajectory, it would be impossible to teach the system using traditional feedforward neural nets or support vector machines. Instead we exploit more powerful, recurrent neural networks (RNNs) with adaptive internal states. Results obtained using LSTM RNNs trained by the recent Evolino algorithm show that this approach can significantly increase the efficiency of suture knot tying in MIS over preprogrammed control.
I. INTRODUCTION
Minimally Invasive Surgery (MIS) has become commonplace for an ever-growing number of procedures. Because MIS is performed through small incisions or ports in the patient’s body, tissue trauma, recovery time, and pain are reduced considerably compared to conventional, “open” surgery. While patients have profited enormously, surgeons have had to cope with reduced dexterity and perception: the instruments are long and have fewer degrees of freedom, force and tactile feedback are lost, and visual feedback is flattened to a 2D image. These factors make delicate maneuvers such as knot-tying very time-consuming. A laparoscopically tied suture knot can take up to three minutes to complete, compared to one second for a manually tied knot.
Robot-assisted MIS seeks to restore the feel of normal surgery by providing the surgeon with a more intuitive and ergonomic interface. The surgeon tele-operates a slave robot that manipulates the surgical instruments from a master console that provides full six-degree-of-freedom manipulation, enhanced 3D imaging, and often force feedback. Robotic surgical systems such as DaVinci [1] and ZEUS [2] are in wide use today, performing a variety of abdominal, pelvic, and thoracic procedures. However, despite significant advances in robot-assisted surgery, delicate tasks like knot-tying are still cumbersome and time-consuming, in some cases taking longer than with conventional MIS [3]. Given that knot-tying occurs frequently during surgery, automating this subtask would greatly reduce surgeon fatigue and total surgery time.
Building a good knot-tying controller is difficult because the 3D trajectories of multiple instruments must be precisely controlled. There has been very little work in autonomous robotic
knot-tying: Kang et al. [4] devised a specialized stitching device, while Mayer et al. [5] were the first to tie a suture knot autonomously using general purpose laparoscopic instruments. In both approaches, the controller uses a hard-wired policy, meaning that it always repeats the same prescribed motion without the possibility of generalizing to unfamiliar instrument locations. One possible way to provide more robust control is to learn the control policy from examples of correct behavior, provided by the user.
The focus of this paper is on automating suture knot winding by training a recurrent neural network (RNN; [6]–[8]) on human generated examples. Unlike standard non-recurrent machine learning techniques such as support vector machines and feedforward neural networks, RNNs have an internal state or short-term memory which allows them to perform tasks such as knot-tying where the previous states (i.e. instrument positions) need to be remembered for long periods of time in order to select future actions appropriately.
To date, the only RNN capable of using memory over sequences of the length found in knot-tying trajectories (over 1000 datapoints) is Long Short-Term Memory (LSTM [9]). Therefore, our experiments use this powerful architecture to learn to control the movement of a real surgical manipulator to successfully tie a knot. Best results were obtained using the recent hybrid supervised/evolutionary learning framework, Evolino [10], [11].
The next section describes the EndoPAR robotic system used in the experiments. In section III, we give a detailed account of the steps involved in laparoscopic knot tying. Section IV describes the Evolino framework, and in section V the method is tested experimentally in the task of autonomous suture knot winding.
II. THE ENDOPAR SYSTEM
The Endoscopic Partial-Autonomous Robot (EndoPAR) system is an experimental robotic surgical platform developed by the Robotics and Embedded Systems research group at the Technical University of Munich (figure 1). EndoPAR consists of four Mitsubishi RV-6SL robotic arms that are mounted upside-down on an aluminum gantry, providing a cm×cm×cm workspace that is large enough for surgical procedures. Although there are four robots, it is easy to access the workspace due to the ceiling-mounted setup. Three of the arms are equipped with force-feedback instruments; the fourth holds a 3D endoscopic stereo camera.
Fig. 1. The EndoPAR system. The four ceiling-mounted robots are shown with an artificial chest on the operating table to test tele-operated and autonomous surgical procedures. Three of the robots hold laparoscopic gripper instruments, while the fourth manipulates an endoscopic stereo camera that provides the surgeon with images from inside the operating cavity. The size of the operating area (including gantry) is approximately 2.5m x 5.5m x 1.5m, and the height of the operating table is approximately 1 meter.
The position and orientation of the manipulators are controlled by two PHANToM™ Premium 1.5 devices from Sensable Inc. The user steers each instrument by moving a stylus pen that simulates the hand posture and feel of conventional surgical implements. The key feature of the PHANToM devices is their ability to provide force feedback to the user. EndoPAR uses a version of the PHANToM device that can display forces in all translational directions (no torque is fed back).
Figure 2 shows the sensor configuration used to implement realistic force feedback in the EndoPAR system. Each instrument has four strain gauge sensors attached at the distal end of the shaft, i.e., near the gripper. The sensors are arranged in two full bridges, one for each principal axis. The signals from the sensors are amplified and transmitted via CAN-bus to a PC system where they are processed and sent to small servo motors that move the stylus to convey the sensation of force to the user. Since direct sensor readings are somewhat noisy, a smoothing filter is applied in order to stabilize the results.
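A minimal sketch of such a smoothing step (the paper does not specify the filter; a moving average over an illustrative window length is assumed here):

```python
from collections import deque

class MovingAverage:
    """FIR smoothing filter for noisy strain-gauge samples; the window
    length is an illustrative choice, not taken from the paper."""
    def __init__(self, window=8):
        self.buf = deque(maxlen=window)

    def step(self, sample):
        self.buf.append(sample)
        return sum(self.buf) / len(self.buf)

# A step change in the raw signal is smoothed over the window.
filt = MovingAverage(window=4)
smoothed = [filt.step(s) for s in (0.0, 0.0, 4.0, 4.0)]
```

A longer window suppresses more noise at the cost of added latency in the force signal, which matters for haptic feedback.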
Force feedback makes performing MIS more comfortable, efficient, safe, and precise. For knot-tying, this capability is essential due to the fine control required to execute the procedure without breaking or losing the thread [12]. As a result, the EndoPAR system provides an excellent platform with which to generate good training samples for the supervised machine learning approach explored in this paper.
III. MIS KNOT-TYING
Tying a suture knot laparoscopically involves coordinating the movements of three grippers through six steps. When the procedure begins, the grippers should be in the configuration depicted in figure 3A with the needle already having pierced the tissue (for safety, the piercing is performed manually by the surgeon). The next step (figure 3B) is to grasp the needle with gripper 1, and manually feed the thread to gripper 3,
Fig. 2. Force feedback. Forces are measured in the x and y directions (perpendicular to shaft). The upper part of the figure shows how the strain gauge sensors are arranged along the circumference of the shaft. Each diametrically opposed pair constitutes a full bridge of four resistors dedicated to one principal axis. Sensor signals are sent back to servo motors at the input stylus so that the surgeon can sense forces occurring at the gripper.
the assistant gripper, making sure the thread is taut. Gripper 1 then pulls the thread through the puncture (figure 3C), while gripper 3 approaches it at the same speed so that the thread remains under tension. Meanwhile, gripper 2 is opened and moved to the position where the winding should take place.
Once gripper 2 is in position, gripper 1 makes a loop around it to produce a noose (figure 3D). For this step it is very important that the thread be under the right amount of tension; otherwise, the noose around gripper 2 will loosen and get lost. To maintain the desired tension, gripper 3 is moved towards the puncture to compensate for the material needed for winding. Special care must be taken to ensure that neither gripper 1 nor the needle interfere with gripper 2 or the strained thread during winding.
After completing the loop, gripper 2 can be moved to get the other end of the thread (figure 3E). Once again, it is critical that the thread stay under tension by having grippers 1 and 3 follow the movement of gripper 2 at an appropriate speed. In figure 3F, gripper 2 has grasped the end of the thread. Therefore, gripper 1 must loosen the loop so that gripper 2 can pull the thread end through the loop. Gripper 3 can now loosen its grasp, since thread tension is no longer needed. Finally, grippers 1 and 2 can pull outward (away from the puncture) in order to complete the knot.
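The six phases above can be summarized as a scripted sequence, mirroring the hard-wired replay approach of [5] that the learned controller is meant to replace (hypothetical encoding, not the authors' code):

```python
# Hypothetical encoding of the knot-tying phases of figure 3 (labels A-F);
# the action strings paraphrase the text.
KNOT_TYING_STEPS = [
    ("A", "start: needle has pierced the tissue, grippers in position"),
    ("B", "gripper 1 grasps the needle; thread fed manually to gripper 3"),
    ("C", "gripper 1 pulls thread through the puncture; gripper 3 keeps tension"),
    ("D", "gripper 1 winds a loop around gripper 2; gripper 3 compensates"),
    ("E", "gripper 2 moves to the free thread end; grippers 1 and 3 follow"),
    ("F", "gripper 2 pulls the end through the loop; knot is tightened"),
]

def run(steps, execute):
    """Replay the scripted sequence with a caller-supplied motion primitive."""
    for label, action in steps:
        execute(label, action)

executed = []
run(KNOT_TYING_STEPS, lambda label, action: executed.append(label))
```

Such a fixed script is exactly what cannot generalize to unfamiliar instrument locations, motivating the learned controller of the next sections.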
The knot-tying procedure just described has been automated successfully by carefully programming the movement of each gripper directly [5]. Programming gripper trajectories correctly is difficult and time-consuming and, more importantly, produces behavior that is tied to specific geometric coordinates. The next section describes a method that can potentially provide a more generic solution by learning directly from human experts.
IV. EVOLINO
Recurrent Neural Networks (RNNs) are a powerful class of models that can, in principle, approximate any dynamical system [13]. This means that RNNs can be used to implement arbitrary sequence-to-sequence mappings that require memory.
Fig. 3. Minimally invasive knot-tying. (A) The knot-tying procedure starts with the needle and three grippers in this configuration. (B) Gripper 1 takes the needle, and the thread is fed manually to gripper 3. (C) The thread is pulled through the puncture, and (D) wound around gripper 2. (E) Gripper 2 grabs the thread between the puncture and gripper 3. (F) The knot is finished by pulling the end of the thread through the loop.
However, training RNNs with standard gradient descent techniques is only practical when a short time window (less than 10 time steps) suffices to predict the correct system output. For longer time dependencies, the gradient vanishes as the error signal is propagated back through time, so that network weights are never adjusted correctly to account for events far in the past [14].
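The decay can be seen in a toy scalar recurrence (illustrative only; real backpropagation through time multiplies Jacobians, but the geometric shrinkage is the same):

```python
# Toy scalar recurrence h_t = w * h_{t-1}: backpropagation multiplies the
# error signal by w at every step, so for |w| < 1 the gradient with respect
# to inputs far in the past decays geometrically (the "vanishing gradient").
def gradient_through_time(w, steps):
    g = 1.0
    for _ in range(steps):
        g *= w
    return g

short_grad = gradient_through_time(0.9, 10)    # still a usable signal
long_grad = gradient_through_time(0.9, 1000)   # effectively zero
```

Over the 1000-plus datapoints of a knot-tying trajectory, an error signal attenuated like this carries no usable learning signal for early time steps.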
Long Short-Term Memory (LSTM; [9], [15], [16]) overcomes this problem by using specialized, linear memory cells that can maintain their activation indefinitely. The cells have input and output gates that learn to open and close at appropriate times, either to let in new information from outside and change the state of the cell, or to let activation out to affect other cells or the network’s output. This cell structure enables LSTM to learn long-term dependencies across almost arbitrarily long time spans. However, in cases where the gradient is of little use due to numerous local minima, LSTM becomes less competitive (as in the case of learning gripper trajectories).
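A minimal sketch of such a memory cell (scalar weights, input and output gates only as in the original LSTM, gate wiring simplified; all names are our assumptions, not the paper's implementation):

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def lstm_cell_step(x, state, w_cell, w_in_gate, w_out_gate):
    """One step of an original-style LSTM memory cell: input and output gates
    only, plus a linear self-connection of weight 1.0 that lets the stored
    state persist indefinitely."""
    g_in = sigmoid(w_in_gate * x)                  # input gate: admit new information?
    g_out = sigmoid(w_out_gate * x)                # output gate: expose the stored state?
    state = state + g_in * math.tanh(w_cell * x)   # linear "constant error carousel"
    return state, g_out * math.tanh(state)
```

With zero input the state term is unchanged, which is exactly the mechanism that lets the cell remember an event indefinitely.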
An alternative approach to training LSTM networks is the recently proposed Evolution of systems with Linear Outputs (Evolino; [10], [11]). Evolino is a framework for supervised sequence learning that combines neuroevolution [17] (the evolution of artificial neural networks) for learning the recurrent weights with linear regression for computing the output weights (see figure 4).
During evolution, Evolino networks are evaluated in two
phases. In the first phase, the network is fed the training sequences (e.g. examples of human-performed knot-tying), and the activations of the memory cells are saved at each time step. At this point, the network does not have connections to its outputs. Once the entire training set has been seen, the second phase begins by computing the output weights analytically using the pseudoinverse. The training set is then fed to the network again, but now the network propagates the input all the way through the new connections to produce an output signal. The error between the output and the correct (target) values is used as a fitness measure to be minimized by evolutionary search.
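The second phase can be sketched as follows (NumPy-based illustration; the matrix shapes and function names are our assumptions):

```python
import numpy as np

def evolino_output_weights(activations, targets):
    """Phase two of an Evolino-style evaluation: from the saved memory-cell
    activations (T x n) and the targets (T x m), compute linear output
    weights via the Moore-Penrose pseudoinverse; the residual error is the
    fitness that the evolutionary search minimizes."""
    W = np.linalg.pinv(activations) @ targets    # optimal least-squares mapping
    residual = activations @ W - targets
    return W, float(np.sum(residual ** 2))

# Toy run: targets generated by a known linear map are recovered exactly.
acts = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
tgts = acts @ np.array([[2.0], [3.0]])
W, fitness = evolino_output_weights(acts, tgts)
```

Because the pseudoinverse yields the optimal linear readout for whatever activations the evolved network produces, evolution only has to discover a good basis, not the output mapping itself.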
The particular instantiation of Evolino in this paper uses the Enforced SubPopulations algorithm (ESP; [18], [19]) to evolve LSTM networks. Enforced SubPopulations differs from standard neuroevolution methods in that instead of evolving complete networks, it coevolves separate subpopulations of network components or neurons (figure 4).
ESP searches the space of networks indirectly by sampling the possible networks that can be constructed from the subpopulations of neurons. Network evaluations provide a fitness statistic that is used to produce better neurons that can eventually be combined to form a successful network. This cooperative coevolutionary approach is an extension of Symbiotic, Adaptive Neuroevolution (SANE; [20]), which also evolves neurons, but in a single population. By using separate
Fig. 4. Evolino. The figure shows the three components of the Evolino implementation used in this paper: the Enforced SubPopulations (ESP) neuroevolution method, the Long Short-Term Memory (LSTM) network architecture (shown with four memory cells), and the pseudoinverse method to compute the output weights. When a network is evaluated, it is first presented with the training set to produce a sequence of network activation vectors that are used to compute the output weights. Then the training set is presented again, but now the activation also passes through the new connections to produce outputs. The error between the outputs and the targets is used by ESP as a fitness measure to be minimized.
subpopulations, ESP accelerates the specialization of neurons into different sub-functions needed to form good networks because members of different evolving sub-function types are prevented from mating. Subpopulations also make the neuron fitness evaluations less noisy because each evolving neuron type is guaranteed to be represented in every network that is formed. Consequently, ESP is able to evolve recurrent networks more efficiently than SANE.
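A minimal sketch of the ESP sampling-and-credit scheme described above follows; the data structures are hypothetical (real ESP neurons encode connection weight vectors, not bare identifiers), but the two mechanics are the ones named in the text: every subpopulation contributes exactly one neuron to each candidate network, and each neuron accumulates the fitness of the networks it participated in.

```python
import random

def assemble_network(subpopulations, rng=random):
    """Draw one neuron from each subpopulation. Because every evolving
    neuron type is guaranteed a slot in every candidate network,
    per-neuron fitness evaluations are less noisy than in a single
    mixed population."""
    return [rng.choice(pop) for pop in subpopulations]

def credit_fitness(network, fitness, scores):
    """Accumulate the evaluated network's fitness on each participating
    neuron; the per-neuron averages drive selection within each
    subpopulation."""
    for neuron in network:
        scores.setdefault(neuron, []).append(fitness)
```

Keeping the subpopulations separate is what prevents "mating" across sub-function types, which the text identifies as the source of ESP's speed-up over SANE.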
Evolino does not evolve complete networks but rather evolves networks that produce a set of activation vectors that form a non-orthogonal basis from which an output mapping can easily be computed. The intuition is that it is often easier to find a sufficiently good basis than to find a network that models the target system directly. Evolino has been shown to outperform gradient-based methods on continuous trajectory generation tasks [10]. Unlike gradient-based methods, it has the ability to escape local minima due to its evolutionary component. Moreover, it is capable of generating precise outputs by using the pseudoinverse, which computes an optimal linear mapping. Previous work with Evolino has concentrated on comparisons with other methods in rather abstract benchmark problems, such as the Mackey-Glass time-series. This paper
Fig. 5. Training the knot winding networks. LSTM networks are trained on a set of recordings that sample the position of gripper 1 at 0.1mm increments while a suture knot was tied under human control. The figure shows three such training sequences; the one with the thicker path shows the sample points that the network uses as input and targets. For each training sequence, the network receives the (x, y, z)-position of gripper 1, and outputs a prediction of the distance to the next position of the gripper (i.e. the next sample in the sequence). The prediction is added to the input and compared to the correct (target) next position to produce an error signal that is used either for gradient descent learning, or as a fitness measure for Evolino after all the training sequences have been processed.
presents the first application of Evolino to a real-world task.

V. EXPERIMENTS IN ROBOTIC KNOT WINDING
Our initial experiments focus on the most critical part of suture knot-tying: winding the suture loop (steps C through F in figure 3). While the loop is being wound by gripper 1, gripper 2 stays fixed. Therefore, networks were trained to control the movement of gripper 1.
A. Experimental Setup
LSTM networks were trained using a database of 25 loop trajectories generated by recording the movement of gripper 1 while a knot was being tied successfully using the PHANToM units. Each trajectory consisted of approximately 1300 gripper (x, y, z)-positions measured at every 0.1mm displacement, {(x_1^j, y_1^j, z_1^j), . . . , (x_{l_j}^j, y_{l_j}^j, z_{l_j}^j)}, j = 1..25, where l_j is the length of sequence j. At each step in a training sequence, the network receives the coordinates of gripper 1 through three input units (plus a bias unit), and computes the desired displacement (Δx, Δy, Δz) from the previous position through three output units.
Both gradient descent and Evolino were used in 20 experiments each to train LSTM networks with 10 memory cells. The Evolino-based networks were evolved for 60 generations with a population size of 40, yielding a total of 3580 evaluations (i.e. passes through the training set) for each experiment.
Figure 5 illustrates the procedure for training the networks. For the gradient descent approach, the LSTM networks were trained using Backpropagation Through Time [6], where the network is unfolded once for each element in the training sequence to form an l_j-layer network (for sequence j) with all layers sharing the same weights. Once the network has seen the last element in the sequence, the errors from each time-step are propagated back through each layer as in standard backpropagation, and then the weights are adjusted.
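The unfolding procedure can be sketched for a scalar linear RNN, where the gradient of the shared weights is accumulated across all unfolded layers in a single backward sweep. This is a toy model for illustration, not the LSTM gradient computation used in the experiments.

```python
def bptt_gradient(w_h, w_x, xs, targets):
    """Backpropagation Through Time for the scalar linear RNN
    h_t = w_h*h_{t-1} + w_x*x_t with loss sum_t (h_t - target_t)^2.
    The network is unfolded once per sequence element; all unfolded
    copies share the same two weights, so their gradients are summed."""
    # Forward pass, storing every state for the backward sweep.
    hs = [0.0]
    for x in xs:
        hs.append(w_h * hs[-1] + w_x * x)
    # Backward sweep: the error from each time-step joins the signal
    # flowing back through the earlier layers before weights change.
    g_h = g_x = 0.0
    dh = 0.0
    for t in range(len(xs), 0, -1):
        dh += 2.0 * (hs[t] - targets[t - 1])  # direct loss term at step t
        g_h += dh * hs[t - 1]                 # shared-weight gradient pieces
        g_x += dh * xs[t - 1]
        dh *= w_h                             # carry the signal one step back
    return g_h, g_x
```

The `dh *= w_h` line is where the vanishing-gradient problem of section IV lives: for long sequences the carried signal is multiplied by the recurrent weight once per layer.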
For Evolino-trained LSTM, each network is evaluated in two phases (see section IV). In the first phase the activations of the network units are recorded, but no outputs are produced as, at this point, the network does not have output connections. After the entire training set has been seen, the output connections are computed using the pseudoinverse. In the second phase, the network produces control actions that are used to calculate the fitness of the network.
The error (fitness) measure used for both methods was the sum-squared difference between the network output plus the previous gripper position and the correct (target) position for each time-step, across the 25 trajectories:
Σ_j Σ_t [(x_t^j + Δx_t − x_{t+1}^j)² + (y_t^j + Δy_t − y_{t+1}^j)² + (z_t^j + Δz_t − z_{t+1}^j)²]
where Δx_t, Δy_t, and Δz_t are the network outputs for each principal axis at time t, which are added to the current position (x_t, y_t, z_t) to obtain the next position. Note that because the networks are recurrent, the output can in general depend on all of the previous inputs.
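A direct transcription of this error measure for a single trajectory might look as follows; the helper is illustrative and assumes the recorded positions and network outputs have been stacked into arrays.

```python
import numpy as np

def trajectory_error(positions, deltas):
    """Sum-squared error between predicted and actual next positions
    for one trajectory, matching the fitness measure above.
    positions: (l, 3) recorded gripper positions for one sequence;
    deltas: (l-1, 3) network outputs (Δx, Δy, Δz) at each step."""
    predicted = positions[:-1] + deltas     # x_t + Δ_t for each axis
    return float(np.sum((predicted - positions[1:]) ** 2))
```

Summing `trajectory_error` over all 25 trajectories gives the quantity minimized by both gradient descent and Evolino.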
For the first 50 time-steps of each training sequence, the network receives the corresponding sequence entry. After that, the network feeds back its current output plus its previous input as its new input for the next time-step. That is, after a washout time of 50 time-steps, the network makes predictions based on previous predictions, having no access to the training set to steer it back on course. This procedure allows the error to accumulate all along the trajectory so that minimizing it forces the network to produce loop trajectories autonomously (i.e. in the absence of a “teacher” input).
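This washout-then-feedback scheme can be sketched generically; the `step_fn` interface below is a hypothetical stand-in for the trained network (any stateful predictor that maps a position to a displacement).

```python
import numpy as np

def autonomous_rollout(step_fn, sequence, washout=50):
    """Feed the true sequence for `washout` steps (teacher input), then
    feed back the network's own prediction (previous input plus the
    predicted displacement) as the next input, so errors accumulate
    along the whole trajectory."""
    outputs = []
    x = sequence[0]
    for t in range(1, len(sequence)):
        delta = step_fn(x)          # predicted displacement at this step
        pred = x + delta
        outputs.append(pred)
        # teacher input during washout, self-feedback afterwards
        x = sequence[t] if t < washout else pred
    return np.array(outputs)
```

Minimizing the accumulated error of such rollouts is what forces the network to produce the loop trajectories autonomously rather than merely one step ahead.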
Once a network has learned the training set, it is tested in a 3D simulation environment to verify that the trajectories do not behave erratically or cause collisions. If the network passes this validation, it is transferred to the real robot where the procedure is executed inside the artificial rib-cage and heart mockup shown in figure 1. To tie the entire knot, a preprogrammed controller is used to start the knot, and then the network takes over for the loop, steps C through E. During the loop, the robot fetches a new displacement vector from the network every 7ms, and adds it to the current position of gripper 1. Gripper 2 remains stationary throughout this phase, and gripper 3 is moved away from the knot at a predefined rate to maintain tension on the thread. When the loop is complete, the control switches back to the program to close the knot. As in training, the winding network receives an initial “approaching sequence” of 50 points that control the robot to start the wind, and then completes the loop itself while feeding back its own outputs.
B. Experimental Results
Figure 6 shows the learning curve for the Evolino-trained LSTM networks. Each datapoint is the average error on the training set of the best network, measured in millimeters.
Fig. 6. Evolino learning curve. The plot shows the average error on the training set measured in millimeters for the best network in each generation, averaged over 50 runs. The vertical bars indicate one standard deviation from the average.
By generation 40, the error has reached a level at which the networks can produce usable loop trajectories. The gradient-trained LSTM networks were not able to learn the trajectories, so the error for this method is not reported. This poor performance could be due to the presence of many local minima in the error surface, which can trap gradient-based methods.
Unlike gradient-based approaches, Evolino is an evolutionary method, and therefore is less susceptible to local minima. All of the 20 Evolino runs produced networks that could generate smooth loop trajectories. When tested on the real robot, the networks reliably completed the loop procedure, and did so in an average of 3.4 seconds, a speed-up of almost four times over the preprogrammed loop. This speed-up in knot winding results in a total time of 25.8 sec for the entire knot, compared to 33.7 sec for the preprogrammed controller.
Figure 7 shows the behavior of several Evolino-trained LSTM networks from the same run at different stages of evolution. As evolution progresses, the controllers track the training trajectories more closely while smoothing them. The network on the right-hand side of the figure was produced after approximately 4.5 hours of computation time.
These first results show that RNNs can be used to learn from training sequences of over one thousand time steps, and possibly provide useful assistance to expedite MIS procedures.
VI. DISCUSSION AND FUTURE WORK
An important advantage of learning directly from expert behavior is that it requires less knowledge about the system being controlled. Supervised machine learning can be used to capture and generalize expertise without requiring the often tedious and costly process of traditional controller design. The Evolino-trained LSTM networks in our experiments were able to learn from surgeons and outperform them on the real robot.
Our current approach only deals with the winding portion of the knot-tying task. Therefore, its contribution is limited by the efficiency of the other subtasks required to complete the full knot. In the future, we plan to apply the same basic approach used in this paper to other knot-tying subtasks (e.g. the thread tensioning performed by the assistant gripper, and knot
Fig. 7. Evolution of loop generating behavior. Each of the three 3D plots shows the behavior of the best network at a different generation (1, 10, and 60) during the same evolutionary run. All axes are in millimeters. The dark curve is the trajectory generated by the network; the lighter curve is the target trajectory. Note that the network is tested on the same target trajectory in each plot. The best network in the first generation tracks the target trajectory closely for the first 15mm or so, but diverges quickly when the target turns abruptly. By the tenth generation, networks can form smooth loops, and by generation 60, the network tracks the target throughout the winding, forming a tight, clean loop.
tightening) that are currently implemented by programmed controllers. The separate sub-controllers can then be used in sequence to complete the whole procedure.
The performance of automated MIS need not be constrained by the proficiency of available experts. While human surgeons provide the best existing control, better strategies may be possible by employing reinforcement learning techniques where target trajectories are not provided, but instead some higher-level measure of performance is maximized. Approaches such as neuroevolution could be used alone, or in conjunction with supervised learning to bootstrap the learning. Such an approach would first require building a simulation environment that accurately models thread physics.
VII. CONCLUSION
This paper has explored the application of supervised learning techniques to the important task of automated knot-tying in Minimally Invasive Surgery. Long Short-Term Memory neural networks were trained to produce knot winding trajectories for a robotic surgical manipulator, based on human-generated examples of correct behavior. Initial results using the Evolino framework to train the networks are promising: the networks were able to perform the task on the real robot without access to the teaching examples. These results constitute the first successful application of supervised learning to MIS knot-tying.
ACKNOWLEDGMENTS
This research was partially funded by SNF grant 200020-107534 and the EU MindRaces project FP6 511931.
REFERENCES
[1] G. Guthart and J. K. Salisbury, Jr., “The Intuitive™ telesurgery system: Overview and application,” in International Conference on Robotics and Automation (ICRA), 2000, pp. 618–621.
[2] A. Garcia-Ruiz, N. Smedira, F. Loop, J. Hahn, C. Steiner, J. Miller, and
M. Gagner, “Robotic surgical instruments for dexterity enhancement in thorascopic coronary artery bypass graft,” Journal of Laparoendoscopic and Advanced Surgical Techniques, vol. 7, no. 5, pp. 277–283, 1997.
[3] A. Garcia-Ruiz, “Manual vs robotically assisted laparoscopic surgery in the performance of basic manipulation and suturing tasks,” Archives of Surgery, vol. 133, no. 9, pp. 957–961, 1998.
[4] H. Kang, “Robotic assisted suturing in minimally invasive surgery,” Ph.D. dissertation, Rensselaer Polytechnic Institute, Troy, New York, May 2002.
[5] H. Mayer, I. Nagy, A. Knoll, E. U. Schirmbeck, and R. Bauernschmitt, “The EndoPAR system for minimally invasive surgery,” in International Conference on Intelligent Robots and Systems (IROS), Sendai, Japan, 2004.
[6] P. Werbos, “Backpropagation through time: what it does and how to do it,” Proceedings of the IEEE, vol. 78, 1990, pp. 1550–1560.
[7] A. J. Robinson and F. Fallside, “The utility driven dynamic error propagation network,” Cambridge University Engineering Department, Tech. Rep. CUED/F-INFENG/TR.1, 1987.
[8] R. J. Williams and D. Zipser, “A learning algorithm for continually running fully recurrent networks,” Neural Computation, vol. 1, no. 2, pp. 270–280, 1989.
[9] S. Hochreiter and J. Schmidhuber, “Long short-term memory,” Neural Computation, vol. 9, no. 8, pp. 1735–1780, 1997.
[10] J. Schmidhuber, D. Wierstra, and F. Gomez, “Evolino: Hybrid neuroevolution/optimal linear search for sequence learning,” in Proceedings of the 19th International Joint Conference on Artificial Intelligence, 2005.
[11] D. Wierstra, F. Gomez, and J. Schmidhuber, “Modeling non-linear dynamical systems with Evolino,” in Proceedings of the Genetic and Evolutionary Computation Conference (GECCO-05). Berlin; New York: Springer-Verlag, 2005.
[12] I. Nagy, H. Mayer, A. Knoll, E. Schirmbeck, and R. Bauernschmitt, “EndoPAR: An open evaluation system for minimally invasive robotic surgery,” in IEEE Mechatronics and Robotics 2004 (MechRob), Aachen, Germany, September 2004.
[13] H. T. Siegelmann and E. D. Sontag, “Turing computability with neural nets,” Applied Mathematics Letters, vol. 4, no. 6, pp. 77–80, 1991.
[14] S. Hochreiter, Y. Bengio, P. Frasconi, and J. Schmidhuber, “Gradient flow in recurrent nets: the difficulty of learning long-term dependencies,” in A Field Guide to Dynamical Recurrent Neural Networks, S. C. Kremer and J. F. Kolen, Eds. IEEE Press, 2001.
[15] F. A. Gers, J. Schmidhuber, and F. Cummins, “Learning to forget: Continual prediction with LSTM,” Neural Computation, vol. 12, no. 10, pp. 2451–2471, 2000.
[16] F. A. Gers and J. Schmidhuber, “LSTM recurrent networks learn simple context free and context sensitive languages,” IEEE Transactions on Neural Networks, vol. 12, no. 6, pp. 1333–1340, 2001.
[17] X. Yao, “Evolving artificial neural networks,” Proceedings of the IEEE, vol. 87, no. 9, pp. 1423–1447, 1999.
[18] F. Gomez and R. Miikkulainen, “Solving non-Markovian control tasks with neuroevolution,” in Proceedings of the 16th International Joint Conference on Artificial Intelligence. Denver, CO: Morgan Kaufmann, 1999. [Online]. Available: http://nn.cs.utexas.edu/keyword?gomez:ijcai99
[19] F. J. Gomez and R. Miikkulainen, “Active guidance for a finless rocket using neuroevolution,” in Proc. GECCO 2003, Chicago, 2003, Winner of Best Paper Award in Real World Applications.
[20] D. E. Moriarty and R. Miikkulainen, “Efficient reinforcement learning through symbiotic evolution,” Machine Learning, vol. 22, pp. 11–32, 1996.
Sambit attac arya
Curriculum Vitae
310 lliance Circle
Cary, C, 27519, S
P one: 910 401-2079 Residence
910 672-1156 or
Email: sb attac uncfsu.edu
eb: tt ://faculty.uncfsu.edu/sb attac/
EDUCATION:
• P D Com uter Science and Engineering
State niversity of e or at uffalo, uffalo,
Se tember 2005
esis Researc : nalysis of Multidimensional Microsco ic Images.
• MS Com uter Science and Engineering
State niversity of e or at uffalo, uffalo,
Se tember 2004
Researc Pro ect: lication of Polygonal Curve Sim lification to Contouring of
Ob ects in Images.
• M. ec Com uter Science and Data Processing
Indian Institute of ec nology, arag ur, India
une 1999
esis Researc : Determination of Mec anical Parameters of Materials using
Mat ematical Modeling and Digital Image Processing ec ni ues.
• MSc Integrated 5 ears in P ysics
Indian Institute of ec nology, an ur, India
May 1996
Researc Pro ect: lication of aser O tics in Differentiation of iological issue.
EXPERIENCE:
all 2012 - Present ssociate Professor, De artment of Mat ematics and
Com uter Science,
ayetteville State niversity
all 2005 - all 2012 ssistant Professor, De artment of Mat ematics and
Com uter Science,
ayetteville State niversity
all 2003 – S ring 2005 Researc ssistant, De artment of iological Sciences,
State niversity of e or at uffalo
Sambit attac arya Curriculum Vitae 2
all 1999 - Summer Summer ecturer, Part-time lecturer and eac ing
2003 ssistant, De artment of Com uter Science,
State niversity of e or at uffalo
REFEREED RESEARCH PUBLICATIONS:
oo c a ters:
1. attac arya, S., C e do, ., Han, S. and Siddi ue, M.
e avior ased roac for Robot avigation and C emical nomaly rac ing . In Sabina esc e, Hong ai iu and Daniel Sc ilberg Ed. , Intelligent Robotics and
lications, Series: ecture otes in Com uter Science CS vol. 7102, . 317
327, 2011 . Publis er: S ringer.
2. C e do, . D., attac arya, S., and as un, M.
se of Multi-level State Diagrams for Robot Coo eration in an Indoor Environment .
In Vaclav Snasel and an Platos and Eyas El-Qa asme Ed. , Digital Information
Processing and Communications vol. 189, . 411-425, 2011 . Publis er: S ringer.
3. at t e robot sees, at t e uman feels: Robotic face detection and t e uman emotional res onse. Studies in brain, face, and emotion.
Montoya, Daniel a er-Oglesbee, lissa attac arya, Sambit
Emotional e ression: e brain and t e face Vol 3 .Studies in brain, face, and
emotion. . 43-71 Porto, Portugal: Edi es niversidade ernando Pessoa reitas
Magal es , . Ed , 2011 . v, 265 .
ournal articles:
4. Pliss, ., rit , . ., Sto ovic, ., Ding, H., Mu er ee, ., attac arya, S., ... ere ney, R. 2014 . on-random Patterns in t e Distribution of OR-bearing
C romosome erritories in Human ibroblasts: et or Model of Interactions. ournal
of cellular ysiology. Available electronically ahead of print.
5. rit , . ., Sto ovic, ., Ding, H., u, ., attac arya, S., aile, D., ere ney, R.
2014. ide-scale alterations in interc romosomal organi ation in breast cancer cells: defining a net or of interacting c romosomes. Human molecular genetics, ddu237. Available electronically ahead of print.
6. Pliss , Malyavant am , attac arya S, and ere ney R.
C romatin dynamics in living cells: Identification of oscillatory motion.
ournal of Cellular P ysiology vol. 228 3 : 609-616. 2013
7. C e do, ., attac arya, S and erragut, E.
Query Processing for Probabilistic State Diagrams Describing Multi le Robot avigation
in an Indoor Environment .
International ournal of Digital Information and ireless Communications I DI C , Vol.
1 2 : 554-572 2012
Sambit attac arya Curriculum Vitae 3
8. Malyavant am .S., attac arya S. and ere ney, R.
“ e arc itecture of functional neig bor oods it in t e mammalian cell nucleus”. dvances in En yme Regulation, vol. 50, issue 1, . 126-134 2010
9. attac arya, S. and C e do, . D.
State Diagram Creation and Code eneration ool for Robot Programming .
ournal of Com uting Sciences in Colleges, Volume 25, Issue 3, . 120-127 2010 .
10. eit M. ., Mu er ee ., attac arya S., u . and ere ney R.
robabilistic model for t e arrangement of a subset of uman c romosome territories
in I38 Human fibroblasts .
ournal of Cellular P ysiology, vol. 221 1 , . 120-9 2009
11. Marella .V., attac arya S., Mu er ee ., u ., ere ney R.
Cell ty e s ecific c romosome territory organi ation in t e inter ase nucleus of
normal and cancer cells .
ournal of Cellular P ysiology, vol. 221 1 , . 130-8 2009
12. C e do, . D. and attac arya, S.
Programming robots it state diagrams .
ournal of Com uting Sciences in Colleges, Volume 24 , Issue 5, . 19-26 2009 .
13. Pliss , Malyavant am , attac arya S, eit M, ere ney R.
C romatin dynamics is correlated it re lication timing .
C romosoma, vol. 118 4, .459-70 2009
14. is ore S. Malyavant am, Sambit attac arya, Marcos arbeitos, o amudra
Mu er ee, in ui u, ran O. ac elmayer, Ronald ere ney.
“Identifying Functional Neighborhoods within the Cell Nucleus: Proximity Analysis of Early S-P ase Re licating C romatin Domains to Sites of ranscri tion, R
Polymerase II, HP1 , Matrin 3 and S - ”.
ournal of Cellular ioc emistry, vol. 105 2 , . 391-403 2008
15. is ore S. Malyavant am, Sambit attac arya, illiam D. lonso, Ra c arya,
Ronald ere ney.
“S atio-tem oral Dynamics of Re lication and ranscri tion Sites in t e Mammalian Cell
Nucleus”.
C romosoma, vol. 117 6, . 553-567 2008
16. R. ere ney, .S. Malyavant am, . Pliss, S. attac arya and R. c arya.
“Spatio-tem oral Dynamics of enomic Organi ation and unction in t e Mammalian Cell Nucleus”.
dvances in En yme Regulation, vol. 45, . 17-26 2005
Conference Proceedings:
17. attac arya, S., C eriyadat, . 2014 . Integrated nalysis of round evel and erial Image Data. IS 2014 Convention. Retrieved from: aisb50.org/.
18. attac arya, S., C e do, ., Mal oltra, R. 2014 . Infusing geos atial com uting in com uter science education. 52nd nnual CM Sout east Conference 2014. Retrieved
from: acmse14. ennesa .edu/inde . .
19. attac arya, S., C e do, ., Mal oltra, R., gra al, R. 2014 . Researc modules in geos atial com uting. 52nd nnual CM Sout east Conference 2014. Retrieved from:
acmse14. ennesa .edu/inde . .
Sambit attac arya Curriculum Vitae 4
20. nu, ., gra al, R., attac arya, S. 2014 . Ran ing tourist attractions using time
series PS data of cabs. IEEE Sout eastCon 2014. Retrieved from:
e .ieee.org/reg/3/sout eastcon2014/.
21. “Agent Based Modeling of Moving Point Objects in Geospatial Data,” Sambit attac arya, ogdan C e do, Ra es Mal otra, icolas Pere , Ra eev gra al, 4t
International Conference on Com uting for eos atial Researc lication, uly 22
24, 2013, San ose, C .
22. “Characterization of Moving Point Objects in Geospatial Data,” Sambit Bhattac arya, ogdan C e do, Ra es Mal otra, icolas Pere , Ra eev gra al, 4t International
Conference on Com uting for eos atial Researc lication, uly 22-24, 2013, San
ose, C .
23. “Analyzing Security reats as Re orted by t e nited States Com uter Emergency Readiness eam S-CERT),” Yolanda Baker, Rajeev Agrawal, Sambit Bhattacharya, IEEE Intelligence and Security Informatics ISI, une 4-7, 2013, Seattle as ington.
24. attac arya, S., C e do, . and Mal otra, R.
eos atial Intelligence as a Conte t for Com uting Education , abstract only . Proceeding of t e 44t CM tec nical sym osium on Com uter science education. Denver, Colorado, S , CM: 741-741.
25. attac arya, S., C e do, . D. and Pere , .
“ esture Classification it Mac ine earning using inect Sensor Data”. Proceedings
of t e ird International Conference on Emerging lications of Information
ec nology E I 2012 , . 348 – 351.
26. attac arya, S., C e do, . D. and Pere , .
“Researc Modules for ndergraduates in Mac ine earning for utomatic esture
Classification”. Proceedings of t e enty- ift International lorida rtificial Intelligence
Researc Society conference IRS-25, 2012 , . 318 – 321.
27. C e do, . D., attac arya, S., and C e do, .
se of Probabilistic State Diagrams for Robot avigation in an Indoor Environment .
Proceedings of t e nnual International Conference on dvanced o ics in rtificial
Intelligence I 2010 , . -97 to -102.
28. attac arya, S. and C e do, . D.
sing Robot Soccer ame o eac dvanced Com uter Science Conce ts .
Proceedings of t e nnual International Conference on Com uter Science Education:
Innovation ec nology CSEI 2010 , . I-205 to I-210.
29. attac arya, S., C e do, . D. and Mobley, S.
n Integrated Com uter Vision and Infrared Sensor ased roac to utonomous
Robot avigation in an Indoor Environment .
Proceedings of t e 2nd International Multi-Conference on Engineering and ec nology
Innovation, 2009, Volume III, . 42 – 47.
30. S. attac arya, R. c arya, . Pliss, .S. Malyavant am and R. ere ney.
“ Hybrid Registration roac for Matc ing enomic Structures in Multimodal
Microsco ic Images of iving Cells”.
Proceedings of t e 2008 International Conference on Image Processing, Com uter
Vision, and Pattern Recognition (IPCV’08), 2008, Volume II, . 217-221.
31. S. attac arya, R. c arya, . Pliss, .S. Malyavant am and R. ere ney.
“Com arison of Intensity ased Similarity Measures for Matc ing enomic Structures in
Microscopic Images of Living Cells”.
Sambit attac arya Curriculum Vitae 5
Proceedings of t e 28t nnual International Conference of t e IEEE Engineering in Medicine and iology Society EM S , 2006, . 3057-3061.
32. S. attac arya, R. c arya, . Pliss, . S. Malyavant am, and R. ere ney. “Automated Matching of Genomic Structures in Microscopic Images of Living Cells Using an Information Theoretic Approach”.
Proceedings of t e 27t nnual International Conference of t e IEEE Engineering in Medicine and iology Society EM S , 2005, . 6425-6428.
33. S. attac arya, . S. Malyavant am, R. c arya and R. ere ney.
“Fractal Analysis of Replication Site Images of the Human Cell Nucleus”.
Proceedings of t e 26t nnual International Conference of t e IEEE Engineering in Medicine and iology Society EM S , 2004, . 1443-1446.
34. M. in, . u, S. attac arya and R. ere ney.
“Surface Approximation Based Image Simplification and Applications”.
Proceedings of t e IEEE International Conference on Systems, Man and Cybernetics
SMC , 2004, . 2988-2993.
OTHER RESEARCH ARTICLES:
Internal S ite a ers:
1. attac arya, S., C e do, . and Han, S.
Construction of Portable E losives ano- iosensor System it Soft are nalysis
and ireless et or ing Ca abilities .
In- ouse ite a er ritten for t e Center for Defense and Homeland Security CDHS
of ayetteville State niversity. uly 2011.
2. attac arya, S., C e do, . and Han, S.
Indoor Mobile Robot System for C emical nomaly rac ing .
In- ouse ite a er ritten for t e Center for Defense and Homeland Security CDHS
of ayetteville State niversity. uly 2011.
AWARDED GRANTS:
1. Co-PI, “Strengt ening Com uter and Information Sciences Engagement and earning
SCISE ”, merican ssociation of Colleges and niversity C , $299,729.
unded from ugust, 2014 to ugust, 2017.
2. Co-PI, “ lications Sum-Product et or s to Recursive Structure earning”, Office of
aval Researc O R Summer aculty Researc Program osted at aval Researc aboratory R , around $16,000 for 10 ee s of summer, 2014.
3. Co-PI, “Probabilistic rame or for Modeling Interactions of Dynamic eos atial
Ob ects it Static andmar s”, Department of Energy (DOE grant for sti end and ot er
allo ances around $16,000 under t e Visiting aculty Program for 10 ee s of
summer, 2013 osted at Oa Ridge ational aboratory OR . it nil C eriyadat
OR and Rau Vatsavai OR as Co-PIs.
4. Co-PI, “Developing the Geospatial Intelligence Certificate at FSU”, National Geospatial
Intelligence gency , $443,463 it Ra es Mal otra PI and ogdan C e do
Co-PI. unded from ugust, 2012 to ugust, 2017. ard Id: HM01771210002.
Sambit attac arya Curriculum Vitae 6
5. PI, “MRI-R2: c uisition of Robots and Robot ccessories for Interdisci linary aculty and Student Researc at ayetteville State niversity”, NSF, Ma or Researc Instrumentation Program MRI-R Recovery and Reinvestment, $175,092 it Mic ael lmeida, ogdan C e do and Daniel Montoya as Co-PIs. unded from May 1st, 2010 to
ril 30t , 2013. ard Id: 0959958.
6. Co-PI, “Pre-College and ndergraduate res man Researc Mentoring O ortunities in
Mat ematics and Science ROMS ”, . S. De artment of Education grant, $583,263
it Daniel O unbor PI and Mu ammad od i Co-PI. unded from Se tember 1st,
2010 to ugust 31st, 2013. ard number: P120 100094.
7. Co-PI, “Im roving Minority Partici ation in S or force and S EM Educational
Pi eline”, ational eronautics and S ace dministration S , Curriculum
Im rovement Partners i ard for t e Integration of Researc into t e ndergraduate Curriculum CIP IR , 2010. unded from October 1st, 2010 to May 1st, 2011 for artial amount of $50,000 as a lanning grant it full ro osal to be re-submitted in 2011.
ritten it Daniel O unbor PI, ufang ao, o n ianc ini, ogdan C e do and Mu ammad od i Co-PIs . Pro osal number: 10-CIP IR2010-0004.
8. PI, S Robotics Summer Program Pro osal, submitted to t e out ro t Stoc rust administered by nited ay of Cumberland County, $13,000 it imberly Smit - urton and ogdan C e do as Co-PIs. unded for ugust 2009.
9. PI, “Pro osal to se Robots in eac ing an Introductory Programming Class and a Com uter Science o ics Class”, Institute for Personal Robotics in Education (IPRE), $10,000 it Mic ael lmeida Co-PI. unded from une 12t , 2008 to une 12t , 2009.
CONFERENCE, SEMINAR AND WORSKHOP PARTICIPATION:
1. ational eos atial-Intelligence gency cademic Researc Program RP
Sym osium and or s o s. Poster titled “Integration of Ground and Aerial Imagery for Geospatial Analysis”. ational cademy of Sciences ec Center, as ington DC. Dates: 09/10/2013 – 09/12/2013.
2. Institute for Mat ematics and its lications IM or s o on Imaging in eos atial lications. Minnea olis, Minnesota. Dates: 09/23/2013 – 09/26/2013. ttended.
3. ational eos atial-Intelligence gency and t e De artment of Homeland Security DHS or s o on Identification of ey no ledge a s in Social Media se
During Disasters. Head uarters, S ringfield. Virginia. Dates: 07/22/2013
07/23/2013. ttended.
4. SC Su er Com uting 2012. e International Conference for Hig Performance Com uting, et or ing, Storage and nalysis. iven an e ui ment grant of a ortable cluster com uter t e ittle e for curriculum develo ment. Salt a e City, ta . Dates: 11/10/2012 to 11/16/2012.
5. Talk titled “Human esture and ction Classification it O en Source Mac ine earning ools” given at the International Conference on dvances in Interdisci linary
Statistics and Combinatorics ISC in C reensboro. Date: 10/07/2012.
6. Talk titled “Recognition Segmentation of Human estures it Mac ine earning
lgorit ms” given at t e S Coast uard SC or s o on Human Performance
ec nology HP in Ham ton, V . Date: 09/14/2012.
7. Seminar talk titled “Robot Program Design on t e isac Platform and Vision ased Robot Localization” given to faculty and students of the Electronics, Computer & Information ec nology De artment in t e Sc ool of ec nology of ort Carolina
gricultural ec nical State niversity C . Date: 04/07/2011.
Sambit attac arya Curriculum Vitae 7
8. Technology Training Corporation’s Military Robotics Conference, November 3-4, 2010, San Diego, California. Attended.
9. CVPR 2010. The 23rd IEEE Conference on Computer Vision and Pattern Recognition, June 13-18, 2010, San Francisco, California. Attended.
10. SIGCSE 2010. The 41st ACM Technical Symposium on Computer Science Education, March 10-13, 2010, Milwaukee, Wisconsin. Presented as panelist in the “Introducing Computing with Personal Robots” workshop. Topic of presentation: “Robotics in the Computer Science Curriculum at FSU – from Introductory Courses to Capstone Projects”.
11. 25th Annual CCSC Eastern Conference, October 30 – 31, 2009, Villanova, Pennsylvania. Presented paper titled “A State Diagram Creation and Code Generation Tool for Robot Programming”.
12. Grants Resource Centers External Funding Conference, Thriving on Change, August 23rd - 26th, 2009, Washington, DC. Attended.
13. Seventh International Conference on Computing, Communications and Control Technologies, July 10 – 13, 2009, Orlando, Florida. Presented paper titled “An Integrated Computer Vision and Infrared Sensor Based Approach to Autonomous Robot Navigation in an Indoor Environment”.
14. Faculty and Student Symposium in celebration of the installation of Dr. James A. Anderson as FSU Chancellor, titled “The Future is Calling: Promoting Scholarship and Excellence at FSU”, April 2nd, 2009, Fayetteville, NC. Talk title: “Teaching an Undergraduate Programming Course Using Graphical Specifications for Robots”.
15. Seventh Annual CCSC Mid-South Conference, April 3 – 4, 2009, Martin, Tennessee. Presented paper titled “Programming Robots with State Diagrams” and attended.
16. SIGCSE 2009. The 40th ACM Technical Symposium on Computer Science Education, March 4-7, 2009, Chattanooga, Tennessee. Attended.
17. SPIE Medical Imaging, February 7 – 12, 2009, Orlando, Florida. Attended.
18. The 2008 World Congress in Computer Science, Computer Engineering and Applied Computing, July 14th to 17th, Las Vegas, Nevada. Presented paper titled “A Hybrid Registration Approach for Matching Genomic Structures in Multimodal Microscopic Images of Living Cells” in IPCV 2008 and attended.
19. Computer Science and Information Technology Symposium, June 28th, 2008, San Antonio, Texas. Attended.
20. Minority Serving Institutions Research Partnership (MSIRP) Conference, May 12th to 15th, 2008, New Orleans, LA. Attended.
21. Accreditation Board for Engineering and Technology (ABET) Faculty Workshop on Assessing Program Outcomes, April 12th, 2008, Linthicum, MD. Attended.
22. S. Bhattacharya. “Using Visual Objects to Create Case Studies in Operating Systems”. Talk and poster presentation at the 8th Annual Conference on Case Study Teaching in Science, October 5th to 6th, 2007, University at Buffalo, State University of New York, Buffalo, NY.
23. DoD Minority Serving Institutions Computational Science and Engineering – High Performance Computing Faculty Training Workshop, July 23rd to 27th, 2007, North Carolina Agricultural and Technical State University, Greensboro, NC. Attended.
24. UNC Teaching and Learning with Technology Conference, March 21st to 23rd, 2007, Raleigh, NC. Attended.
25. Accreditation Board for Engineering and Technology (ABET) Faculty Workshop on Assessing Program Outcomes, March 31st, 2007, Linthicum, MD. Attended.
26. 28th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBS), August 30th to September 3rd, 2006, New York City, NY. Presented paper titled “Comparison of Intensity Based Similarity Measures for Matching Genomic Structures in Microscopic Images of Living Cells”.
27. UNC Teaching and Learning with Technology Conference, March 15th to 17th, 2006, Raleigh, NC. Attended.
28. S. Bhattacharya. “Automated tracking of genomic structures in time-lapse microscopy using a variant of the iterative closest point algorithm”. Talk and poster presentation at the FASEB 2005 Summer Research Conferences.
29. S. Bhattacharya, A. Pliss, K.S. Malyavantham, R. Acharya and R. Berezney. “Automated tracking of genomic structures in time-lapse microscopy using a variant of the iterative closest point algorithm”. Poster presented at the Ninth Annual Buffalo DNA Replication and Repair Symposium, University at Buffalo, 2005.
30. 26th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBS), September 1st to 5th, 2004, San Francisco, CA. Presented paper titled “Fractal Analysis of Replication Site Images of the Human Cell Nucleus”.
31. S. Bhattacharya, A. Pliss, K.S. Malyavantham, R. Acharya and R. Berezney. “Image Processing and Computational Algorithms for the Analysis of Images of Chromatin Domains in the Mammalian Cell Nucleus”. Poster presented at the Eighth Annual Buffalo DNA Replication and Repair Symposium, University at Buffalo, 2004.
32. S. Bhattacharya and R. Berezney. “Application of Image Processing Algorithms to the Analysis of the Dynamics of Chromatin Domains in Living Cells”. Talk and poster presentation at the FASEB 2003 Summer Research Conferences.
PRIMARY TEACHING INTERESTS:
• CS1 and CS2. Languages used or currently in use: Python, Java, C++, MATLAB.
• Algorithms and Data Structures
• Programming Languages
• Operating Systems
• Computer Organization and Architecture
• Software Tools
• Image Processing and Computer Vision
• Machine Learning
• Bioinformatics
COURSE DEVELOPMENT ACTIVITIES:
1. Redesign of the Operating Systems course with instructional modules developed for process and threads management on multi-core and networked nodes. The LittleFe portable cluster, received as a grant from a workshop at the Super Computing 2012 (SC12) conference, is used for laboratory exercises and projects. Fall 2013.
2. A special topics course titled Mobile Application Development for BlackBerry Smartphone Devices. Supported by an award of 10 BlackBerry Smartphone devices and curriculum development materials from Research in Motion Limited. First offered in Fall 2011.
3. Computer Organization and Architecture II. New course added to the CS curriculum in Spring 2007 at FSU. Developed in both regular and fully online course formats.
4. Operating Systems I. Existing regular course developed in fully online format in Spring 2007 at FSU.
PARTICIPATION IN UNDERGRADUATE RESEARCH ACTIVITIES:
1. Supervised Computer Science undergraduate student Nicolas Perez in research. Project title: “Precision of the Kinect 3D Sensor for Distance and Human Skeletal Data”. The student presented results from this project in an oral presentation at the State of North Carolina Undergraduate Research and Creativity Symposium (SNCURCS), 2011.
2. Supervised Computer Science undergraduate student Marcus Mohr in research. Project title: “Precision Movement of a Humanoid Robot NAO”. The student presented a poster on this project at the State of North Carolina Undergraduate Research and Creativity Symposium (SNCURCS), 2011.
3. Supervised Computer Science undergraduate student Sean Mobley in research as part of the NC-LSAMP, 2008-2009 program at FSU. Project title: “Mobile Robot Navigation Using a Multi-Sensor Approach”. The student presented a poster on this project at the State of North Carolina Undergraduate Research and Creativity Symposium (SNCURCS), 2010.
4. Organized and presented workshop on Biotechnology/Computational Biology for the FSU-RISE program on 13th September, 2008.
5. Supervised Computer Science undergraduate student in research as part of the NC-LSAMP, 2007-2008 program at FSU. Project title: “Using SketchUp to Create a Three Dimensional Model of the Fayetteville State University Campus”. Student name: David Bell Jr.
6. Organized and presented workshop on Biotechnology/Computational Biology for the FSU-RISE program on 27th October and 3rd November, 2007.
7. Organized and presented workshop on Biotechnology/Computational Biology for the FSU-RISE program on 18th November and 2nd December, 2006.
8. Supervised Computer Science undergraduate student during the Biomedical Summer Undergraduate Research Experience (B-SURE), 2003 program at SUNY Buffalo.
9. Supervised Mathematics undergraduate student during the B-SURE, 2002 program at SUNY Buffalo.
MEMBERSHIPS AND SERVICES
Professional Memberships
1. Member of the Association for Computing Machinery (ACM).
2. Member of the IEEE Computer Society.
Review Work
1. Public Library of Science (PLoS) ONE, http://www.plosone.org/.
2. National Science Foundation (NSF) proposal review panels. Reviewed eight (8) proposals. Years: 2011.
3. Consortium for Computing Sciences in Colleges (CCSC). Years: 2009 - 2012.
4. IEEE Engineering in Medicine and Biology Society (EMBS). Years: 2005 - 2013.
Committees Served
1. Chancellor’s Faculty Advisory Committee on the Center for Defense and Homeland Security (CDHS), Fayetteville State University
2. Faculty Senate, Fayetteville State University
3. Committee for Reading Across the Curriculum, Fayetteville State University
4. Committee for Blackboard Shared Hosting, UNC General Assembly
5. Committee for Faculty Promotion and Tenure Review Process, Fayetteville State University
6. Computer Science Program Accreditation Committee, Fayetteville State University
7. Computer Science New Programs Committee (Master of Science in Information Technology), Fayetteville State University
8. Computer Science Committee to Establish New Certification Related Electives, Fayetteville State University
9. Computer Science Online Course Development Committee, Fayetteville State University
10. Liberal Arts Review Committee, Fayetteville State University
11. Foundations of Excellence, Fayetteville State University
12. Robotics Committee, Fayetteville State University
Community Services
1. Conducted the FSU Summer Robotics Workshop for middle and high school students, August 3rd to 7th, 2009. Funded by a United Way FSU grant.
2. Presentation to high school students on Computer Science for The Science House, NC State University, July 16th, 2009.
3. Mentored two students for the CPSER (Center for Promoting STEM Education and Research) Five Week High School Rising Senior Summer Research Internship, June 29th to July 31st, 2009.
4. Participated in the FSU Mathematics and Science Undergraduate Research Symposium, April 24th, 2009.
5. Participated in the FSU Academic Fair, March 21st, 2009 as booth representative for Computer Science.
6. Presentation about Computer Science and Robotics for the FSU/CSC Math and Science Family Night, March 18th, 2009.
7. Served as booth representative for Computer Science and Robotics at the 4th Annual Parent Conference: “For the Love of Children”, March 12th, 2009.
8. Presentation about robotics for the FSU High School Math and Science Night, November 19th, 2008.
9. Participated in the Department of Mathematics and Computer Science open house on 14th April, 2007.
10. Judge at North Carolina Region IV Science Fair. Location and Date: University of North Carolina at Pembroke, February 24th, 2007.
11. Judge at North Carolina Region IV Science Fair. Location and Date: University of North Carolina at Pembroke, February 18th, 2006.
Computer Science and Engineering Dept., Graduate Student Association (CSEGSA) SUNY at Buffalo
1. Served as Vice President of CSEGSA.
2. Served as Graduate Affairs Committee (GAC) student member.
Cipta — Application of the Evolving Neural Network Algorithm for Rainfall Prediction
APPLICATION OF THE EVOLVING NEURAL NETWORK ALGORITHM FOR RAINFALL PREDICTION
Subhan Panji Cipta
Department of Informatics Engineering, STMIK Indonesia Banjarmasin
e-mail: Panji.cipta@gmail.com
ABSTRACT
Weather and climate information contributes as one consideration for decision makers, because weather/climate information has economic value in a variety of activities, ranging from agriculture to flood control. The available data imply that current rainfall predictions are not very accurate: the forecasts regularly given to the public are weather forecasts, not rainfall amounts. This study uses the Evolving Neural Network (ENN) algorithm as an approach to predicting rainfall; data processing and calculations use MatLab 2009b. The parameters used in this study are time, rainfall, humidity, and temperature. The results are also compared with the BPNN test results and the BMKG predictions. From the research conducted, from the early stages through testing and measurement, the ENN predicts rainfall with better accuracy than the BPNN algorithm and the BMKG predictions.
Keywords: prediction, rainfall, evolving neural network.
I. INTRODUCTION
Weather and climate information plays a role as one consideration for decision makers. It is therefore wise to heed weather and climate information and use it as input for decisions, both before and during an activity. This is because weather/climate information has economic value in a variety of activities, from agriculture to flood control. In addition, weather/climate information plays a role in monitoring anomalies and global climate change driven by increased human activity [1]. Rainfall data are usually recorded daily. Zhi-liang Wang and Hui-hua Sheng [2] proposed applying the General Regression Neural Network (GRNN) to predict annual rainfall in Zhengzhou. They showed that the GRNN has advantages in fitting and prediction over the Backpropagation Neural Network (BP-NN) and the stepwise regression analysis (SRA) method; the GRNN simulation results for annual rainfall were better, with higher accuracy, than those of the BP-NN. Pei-Chann Chang and Yen-Wen Wang [3] used an Evolving Neural Network to predict PCB production with good results.
Thus, one approach to setting the weights of a neural network is the genetic algorithm, which, besides Pei-Chann Chang and Yen-Wen Wang [3], has also been applied by Li Chungui, Xu Shu’an, and Wen Xin [4] for traffic-flow prediction, and by Ganji Huang and Lingzhi Wang [5] to train neural networks for hydrologic forecasting. All three studies propose neural networks that evolve; this model absorbs some of the benefits of genetic algorithms and neural networks, and the comparison results show that the suggested model can improve prediction accuracy.
The problem statement of this study is:
Current rainfall predictions are not very accurate, so an approach is needed that can produce more accurate rainfall predictions. This study therefore uses the Evolving Neural Network (ENN) algorithm as an approach to predicting rainfall. The research question is: “How can the Evolving Neural Network algorithm be applied to predict rainfall more accurately?”
The objective of this study is:
JTIULM - Volume 1, Number 1, January-June 2016: 1-8
Based on the background and problem statement above, this study aims to apply the Evolving Neural Network (ENN) algorithm to predict rainfall more accurately.
The benefits of this study are:
a. For the author
To apply and use the theory learned in coursework, especially that relating to the Evolving Neural Network (ENN) algorithm.
b. For the institution
The results of this study are expected to serve as a rainfall prediction tool for BMKG, in particular the Banjarbaru Climatology Station, South Kalimantan.
c. For the reader
To provide an overview and understanding of how the Evolving Neural Network (ENN) algorithm is applied in a case study of rainfall time-series data.
II. LITERATURE REVIEW
A. Rain, Rainfall, Weather and Climate
Rain is one form of precipitation of water vapor originating from clouds in the atmosphere. Other forms of precipitation are snow and ice. For rain to occur, condensation nuclei such as ammonia, dust, and sulfuric acid are required; these condensation nuclei have the property of drawing water vapor from the air [1].
Rainfall (CH) is the amount of water falling on a flat ground surface during a given period, measured as a height (mm) above a horizontal surface in the absence of evaporation, runoff, and infiltration. Rainfall is measured in mm or inches [1].
Weather is the overall state of the atmospheric variables at a place over a short interval of time. In terms of its components, weather is the state of the atmosphere expressed by the values of various parameters, including temperature, pressure, wind, humidity, and various rain phenomena, at a place or region over a short period.
Climate is the synthesis of weather events over a long period, sufficient to establish statistical properties that differ from the state at any particular moment. It can also be stated as an abstract concept describing the habitual weather and atmospheric elements of an area over a long period [6].
B. Data Mining
Data mining is defined as the process of discovering patterns in data. The process must be automatic or semi-automatic, and the patterns found must be useful so that they provide benefit [7]. Data mining is the process of finding meaningful relationships, patterns, and trends by sifting through large amounts of data using pattern-recognition technology as well as statistical and mathematical techniques [8].
C. Neural Network
An artificial neural network is an important tool for quantitative modeling. Neural networks have been very popular among researchers and practitioners over the past 20 years and have been successfully applied to a variety of problems in almost all areas of business, industry, and science (Widrow, Rumelhart & Lehr, 1994). Today, neural networks are treated as a standard data-mining tool and are used for data-mining tasks such as pattern classification, time-series analysis, prediction, and clustering [9].
D. Evolving Neural Networks (ENN)
An Evolving Neural Network is a way of weighting the connections between neurons in different layers using the principles of the genetic algorithm (GA). The following figure explains the structure of the ENN algorithm [10]:
Fig. 1. Structure of the Evolving Neural Network algorithm
The ENN algorithm proceeds as follows [3] [11] [12]:
1. Chromosome representation (encoding)
Each gene represents the weight between two neurons in different layers. A chromosome is built from a string of genes, as illustrated in Figure 2.
Fig. 2. Chromosome encoding process
For example, the first gene in the chromosome string is W15, the weight connecting neuron 1 and neuron 5. The second gene is W16, the weight connecting neurons 1 and 6, and so on.
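The gene-to-weight mapping described above can be sketched in a few lines (a minimal illustration in Python rather than the paper's MatLab; the 4-2-1 layer sizes echo the architectures evaluated later in Table IV, and all function and variable names are ours):

```python
import numpy as np

def decode_chromosome(genes, n_in=4, n_hid=2, n_out=1):
    """Split a flat gene string into the two weight matrices of an
    n_in -> n_hid -> n_out feed-forward network (biases omitted,
    as in the paper's illustration)."""
    genes = np.asarray(genes, dtype=float)
    assert genes.size == n_in * n_hid + n_hid * n_out
    w_ih = genes[: n_in * n_hid].reshape(n_in, n_hid)   # genes W15, W16, ...
    w_ho = genes[n_in * n_hid:].reshape(n_hid, n_out)   # hidden-to-output genes
    return w_ih, w_ho

# A 4-2-1 network needs 4*2 + 2*1 = 10 genes; step 2 draws them from [0, 1):
chromosome = np.random.default_rng(0).uniform(0, 1, 10)
w_ih, w_ho = decode_chromosome(chromosome)
print(w_ih.shape, w_ho.shape)   # → (4, 2) (2, 1)
```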
2. Determine the initial population of chromosomes (initial population)
Initial weights are drawn randomly from 0 to 1. The weights in the chromosomes are then evaluated by the GA operators.
3. Compute the objective value of each chromosome (FF-NN)
The Mean Absolute Percentage Error (MAPE) is used as the function that evaluates the deviation on the training data during training:
(1) Compute the output of the hidden layer
(2) Compute the output of the output layer
(3) Compute the error
(4) Evaluate the MAPE value g(s)
4. Compute the fitness function (compute fitness value)
An individual is evaluated by a particular function as a measure of its quality, called the fitness function:
fit(s) = 1 − g(s)
5. Parent selection (reproduction/selection)
The probability p(s) of each chromosome s being selected as a parent is defined by fitness-proportional selection, p(s) = fit(s) / Σ fit(s′).
6. Crossover
The GA applies the crossover operator to pairs of parents with a given probability. The operation used is two-point crossover.
7. Mutation
The GA applies the mutation operator to each chromosome. The operation used is one-point mutation, to help reach an optimal solution.
8. Elite strategy
The top 50% of solutions, in order, are retained to preserve the quality of the solutions in each generation. In other words, the best individual of a generation is copied into the population of the next generation, so the new population contains at least one best individual whose quality equals or exceeds that of the previous generation.
9. Replacement
The new population produced by the previous steps replaces the entire old generation.
10. Stopping criteria
If the number of generations has reached its maximum value, the process stops; otherwise it returns to step 3.
11. Forecast (forecast and recall)
The MAPE function is used to evaluate the accuracy of each prediction produced by the ENN.
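The eleven steps above can be condensed into a runnable sketch (Python with NumPy as a stand-in for the paper's MatLab implementation; the network size, tanh activation, and synthetic data are our assumptions, while the population size of 50, 2-point crossover at rate 0.8, 1-point mutation at rate 0.3, and the 50% elite strategy follow the settings the paper reports):

```python
import numpy as np

rng = np.random.default_rng(42)

N_IN, N_HID, N_OUT = 4, 2, 1                      # one of the 4-x-1 structures
N_GENES = N_IN * N_HID + N_HID * N_OUT

def forward(genes, X):
    """Steps 3(1)-(2): hidden-layer and output-layer responses."""
    w_ih = genes[: N_IN * N_HID].reshape(N_IN, N_HID)
    w_ho = genes[N_IN * N_HID:].reshape(N_HID, N_OUT)
    hidden = np.tanh(X @ w_ih)                    # activation choice is ours
    return (hidden @ w_ho).ravel()

def g(genes, X, y):
    """Steps 3(3)-(4): g(s), the MAPE of chromosome s on the training set."""
    return float(np.mean(np.abs((y - forward(genes, X)) / y)))

def evolve(X, y, pop_size=50, generations=300, p_cross=0.8, p_mut=0.3):
    pop = rng.uniform(0, 1, (pop_size, N_GENES))              # step 2
    n_elite = pop_size // 2                                   # step 8: top 50%
    for _ in range(generations):
        errs = np.array([g(ind, X, y) for ind in pop])        # step 3
        fit = 1.0 - errs                                      # step 4
        p = np.clip(fit, 1e-9, None)
        p /= p.sum()                                          # step 5: p(s)
        elite = pop[np.argsort(errs)[:n_elite]].copy()
        children = []
        while len(children) < pop_size - n_elite:
            i, j = rng.choice(pop_size, size=2, p=p)
            a, b = pop[i].copy(), pop[j].copy()
            if rng.random() < p_cross:                        # step 6: 2-point
                lo, hi = sorted(rng.choice(N_GENES, size=2, replace=False))
                a[lo:hi], b[lo:hi] = b[lo:hi].copy(), a[lo:hi].copy()
            for c in (a, b):
                if rng.random() < p_mut:                      # step 7: 1-point
                    c[rng.integers(N_GENES)] = rng.uniform(0, 1)
            children.extend([a, b])
        pop = np.vstack([elite] + children[: pop_size - n_elite])  # step 9
    best = min(pop, key=lambda ind: g(ind, X, y))             # steps 10-11
    return best, g(best, X, y)

# Tiny synthetic check (not the BMKG data):
X = rng.uniform(0.2, 1.0, (30, N_IN))
y = X @ np.array([0.4, 0.3, 0.2, 0.1]) + 0.5
best, err = evolve(X, y, generations=100)
print(round(err, 3))
```

Note that the elite here is selected by the lowest g(s), which is equivalent to the highest fitness since fit(s) = 1 − g(s).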
III. RESEARCH METHOD
Fig. 3. Stages of the experimental research method
A. Data Collection Method
This study uses rainfall and humidity data obtained from BMKG – Banjarbaru Climatology Station, South Kalimantan. The data needed for this study are secondary and primary data.
B. Initial Data Processing Method
The data obtained from the institution still consist of many parameters, so they must first be recapitulated according to need. The result of this process is data with the attributes: time, rainfall, humidity, and temperature. Each data row is one month of data over 10 years, giving 120 rows of data.
TABLE I
RECAPITULATED DATA, 2001-2010
Year    Month    Rainfall (mm)    Humidity    Temperature
2001    Jan-01   354.6            90.0        26.23
        Feb-01   145.2            89          26.63
        Mar-01   234.2            89          26.81
        Apr-01   206.8            89          27
...     ...      ...              ...         ...
2010    Sep-10   338.2            88.92       25.94
        Oct-10   256.5            86.01       26.65
        Nov-10   317.5            87.56       26.46
        Dec-10   354.7            87.68       25.86
C. Proposed Method
Based on the related research in the previous chapter, time-series prediction using the Evolving Neural Network (ENN) method is reported to be more accurate, so the method proposed for predicting rainfall is the Evolving Neural Network (ENN), implemented in MatLab 2009b.
D. Backpropagation Neural Network (BPNN)
Fig. 4. Matlab implementation of BPNN-LM
Fig. 5. MSE value
E. Evolving Neural Network (ENN) Model
Five important genetic-algorithm parameters [12], namely the crossover type and rate, the mutation type and rate, and the selection type, are set using the Taguchi design of experiments [3]. The Taguchi design is used to tune these parameters; the approach computes the signal-to-noise (S/N) ratio to find a better combination. The S/N ratio is defined as [13]:
S/N = −10 × log10( (1/n) × Σ yi² )
where:
n = the number of experiments (data rows)
yi = the result of the i-th experiment
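Since a smaller MAPE is better, this is Taguchi's smaller-the-better S/N ratio, S/N = −10·log10((1/n)·Σ yi²). A direct transcription in Python (the sample values are hypothetical, not taken from the paper):

```python
import math

def sn_smaller_is_better(y):
    """Taguchi smaller-the-better ratio: S/N = -10 * log10((1/n) * sum(y_i^2))."""
    return -10.0 * math.log10(sum(v * v for v in y) / len(y))

# Three replicate MAPE results for one parameter setting (hypothetical values):
print(round(sn_smaller_is_better([0.15, 0.20, 0.25]), 2))   # → 13.8
```

Smaller errors yield a larger S/N, which is why the best level of each factor is the one with the highest average S/N in Table III.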
The codes and levels for each parameter are:
TABLE II
PARAMETER CODES AND LEVELS
Parameter/code        Level 1   Level 2   Level 3   Level 4
Crossover (A)         1-Point   1-Point   2-Point   2-Point
Mutation (B)          1-Point   1-Point   Shift     Shift
Selection (C)         Total     Total     Elite     Elite
Crossover rate (D)    0.2       0.4       0.6       0.8
Mutation rate (E)     0.1       0.3       0.5       0.7
According to Pei-Chann Chang et al. [14], the convergence profile for semi-seasonal time-series data shows that the system can converge after 2,000 generations even for a small population size such as 10. However, fast and smooth convergence is obtained with a population of 40 or 50, which converges to a fairly stable state after about 300 generations.
Fig. 6. MAPE value
Therefore, a population size of 50 is used as the initial population for the experiments.
With rainfall data as input and target, the genetic algorithm (without the neural network) was run three times with a population of 50, the parameters of each operator set according to the planned levels; the average S/N ratio of each factor level was then computed, with the results shown in the following table:
TABLE III
AVERAGE S/N RATIOS
          Factor
          A        B        C        D        E
Level 1   12.19    24.00    12.15    14.31    14.79
Level 2   12.98    23.47    13.02    14.40    15.14
Level 3   17.60    5.91     11.78    14.96    14.36
Level 4   16.51    5.91     16.99    15.62    15.05
From this table, the best combination of parameter settings is found to be (A)3 - (B)1 - (C)4 - (D)4 - (E)2 (the highest average S/N in each column). These codes represent two-point crossover, one-point mutation, the elitist replacement strategy, a crossover rate of 0.8, and a mutation rate of 0.3.
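Reading off the best level per factor is a column-wise argmax over Table III, since a larger S/N is better (a small Python check; the numbers are copied from the table above):

```python
# Average S/N ratios from Table III: one list of levels 1-4 per factor.
table = {
    "A": [12.19, 12.98, 17.60, 16.51],
    "B": [24.00, 23.47, 5.91, 5.91],
    "C": [12.15, 13.02, 11.78, 16.99],
    "D": [14.31, 14.40, 14.96, 15.62],
    "E": [14.79, 15.14, 14.36, 15.05],
}

# Larger S/N is better, so take the level with the maximum average per factor:
best = {f: max(range(4), key=lambda i: levels[i]) + 1 for f, levels in table.items()}
print(best)   # → {'A': 3, 'B': 1, 'C': 4, 'D': 4, 'E': 2}
```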
F. Experimental Results
TABLE IV
EVALUATION OF EACH STRUCTURE
FFN architecture           Evaluation
Input  Hidden-1  Out   MSE        RMSE     MAPE      MAD
4      2         1     7621.414   87.301   29.07 %   74.592
4      4         1     3913.691   62.560   29.76 %   84.717
4      2         1     2528.506   50.284   15.15 %   43.317
4      4         1     1032.203   32.128   19.10 %   57.020
Fig. 7. Validation results graph
It can be seen that the MSE, RMSE, MAPE, and MAD values produced by the ENN are much smaller than those produced by the BPNN. It can therefore be concluded that the Evolving Neural Network algorithm predicts rainfall more accurately than BPNN-Levenberg-Marquardt, BPNN-scaled conjugate gradient, and the BMKG predictions, with a prediction accuracy of 85%.
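The four measures in Table IV can be reproduced with a few lines (Python; the example values are hypothetical rainfall figures, not the paper's test set):

```python
import numpy as np

def evaluate(y_true, y_pred):
    """The four evaluation measures of Table IV."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    err = y_true - y_pred
    mse = float(np.mean(err ** 2))
    rmse = float(np.sqrt(mse))
    mape = float(np.mean(np.abs(err / y_true))) * 100.0   # in percent
    mad = float(np.mean(np.abs(err)))
    return mse, rmse, mape, mad

# Hypothetical monthly rainfall values (mm) and predictions, not the paper's data:
mse, rmse, mape, mad = evaluate([354.6, 145.2, 234.2], [340.0, 150.0, 230.0])
print(round(rmse, 2), round(mape, 2))   # → 9.2 3.07
```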
IV. CONCLUSIONS AND SUGGESTIONS
A. Conclusions
From the research conducted, from the early stages through testing and measurement, the ENN predicts rainfall with better accuracy than the BPNN algorithm and the BMKG predictions.
The application of the ENN algorithm can thus provide a solution for forecasters and farmers, and can serve as a rainfall prediction tool usable by BMKG climatology stations in every city or regency, particularly in Kalimantan.
B. Suggestions
Based on the research results and measurements, the ENN has better accuracy in rainfall prediction. However, a few points should be noted for better application of the ENN:
The data feeding the system can be more detailed (daily) and more plentiful.
The GA can improve the weighting of the neural network. Nevertheless, further optimization is needed, for example with Principal Component Analysis (PCA) [15] or fuzzy logic [16] [17], which is expected to further improve rainfall prediction accuracy.
REFERENCES
[1] Handoko, Ed., Klimatologi Dasar. Jakarta: Pustaka Jaya, 1994.
[2] Zhi-liang Wang and Hui-hua Sheng, "Rainfall Prediction Using Generalized Regression Neural Network: Case study Zhengzhou," in 2010 International Conference on Computational and Information Sciences, 2010, pp. 1265-1268.
[3] Pei-Chann Chang and Yen-Wen Wang, "Using Soft Computing Methods for Time Series Forecasting," in Recent Advances in Data Mining of Enterprise Data: Algorithms and Applications, P.M. Pardalos, Ed. Singapore: World Scientific, 2007, ch. 4, pp. 189-246.
[4] Li Chungui, Xu Shu’an, and Wen Xin, "Traffic Flow forecasting Algorithm Using Simulated Annealing Genetic BP Network," in 2010 International Conference on Measuring Technology and Mechatronics Automation, 2010, pp. 1043-1046.
[5] Ganji Huang and Lingzhi Wang, "Hybrid Neural Network Models for Hydrologic Time Series Forecasting Based on Genetic Algorithm," in 2011 Fourth International Joint Conference on Computational Sciences and Optimization, 2011, pp. 1347-1350.
[6] Kadarsah and Ahmad Sasmita, "Standardisasi Metadata Klimatologi Dalam Penelitian Perubahan Iklim Di Indonesia," in Prosiding PPI Standardisasi 2010, Banjarmasin, 2010, pp. 1-18.
[7] I.H. Witten and Eibe Frank, Data Mining: Practical Machine Learning Tools and Techniques, 2005.
[8] Daniel T. Larose, Discovering Knowledge in Data, 2005.
[9] Oded Maimon and Lior Rokach, Data Mining and Knowledge Discovery Handbook, 2010.
[10] Sani Susanto and Dedy Suryadi, Pengantar Data Mining : Menggali Pengetahuan Dari Bongkahan Data. Yogyakarta: Andi, 2010.
[11] Suyanto, Soft Computing: Membangun Mesin Ber-IQ Tinggi. Bandung, Jawa Barat, Indonesia: Informatika, 2008.
[12] Suyanto, Evolutionary Computing: Komputasi Berbasis 'Evolusi' dan 'Genetika'. Bandung: Informatika, 2008.
[13] Philip J. Ross, Taguchi Techniques For Quality Engineering: Loss Function, Orthogonal Experiments, Parameters and Tolerance Design, 2nd ed. New York: Mc Graw-Hill Companies, Inc., 1996.
[14] Pei-Chann Chang, Yen-Wen Wang, and C.Y. Tsai, "Evolving Neural Network for Printed Circuit Board Sales," Expert System Application, vol. 29(1), pp. 83-92, 2005.
[15] Gao Guorong and Liu Yanping, "Traffic Flow Forecasting based on PCA and Wavelet Neural Network," in 2010 International Conference of Information Science and Management Engineering, 2010, pp. 158-161.
[16] QianZhang and Tongna Liu, "A Fuzzy Rules and Wavelet Neural Network Method for Mid-long-term Electric Load Forecasting," in Second International Conference on Computer and Network Technology, 2010, pp. 442-446.
[17] Hai-shuang Guan, Wen-ge Ma, and Qiu-ping Wang, "Real-time Optimal Control of Traffic Flow Based on Fuzzy Wavelet Neural Networks," in Fourth International Conference on Natural Computation, 2008, pp. 509-511.
[18] Wint Thida Zaw and Thinn Thu Naing, "Modeling of Rainfall Prediction over Myanmar Using Polynomial Regression," in 2009 International Conference on Computer Engineering and Technology, 2009, pp. 316-320.
A Program Designed to Empower Engineering Educators
Melany M. Ciampi
Safety, Health and Environment Research Organization
São Paulo, Brazil
melany@copec.org.br
Luis Amaral
Computer Graphics Center
Guimarães, Portugal
amaral@dsi.uminho.pt
Victor F. A. Barros
Science and Education Research Council
Braga, Portugal
victor@copec.org.br
Abstract— Utilizing emerging technologies to provide expanded learning opportunities is critical to the success of future generations, so teachers have to be prepared to motivate and entice students to acquire the knowledge pertinent to their formation as engineers. This paper describes the "International Engineering Educator" program developed by the engineering education research team of COPEC – Science and Education Research Council. It is offered by the International Institute of Education of COPEC, a certification organization recognized by the country's Ministry of Education under the National Law of Higher Education. It also offers a professional register as "International Engineering Educator" with the International Society for Engineering Pedagogy for those interested in an additional credential. The target attendees of this program are the engineering community of CPLP – Community of Portuguese Language Countries. The program is successful, and many interested professionals are attending it.
Keywords— On line technology; cultural skills; innovation; global vision; leadership.
I. INTRODUCTION
Most of the engineers working in education today hold a PhD or a Master's degree. This does not mean that such a professional has all the tools to be a proper teacher, now more than ever, since today's students have peculiarities that were unthinkable not long ago.
No matter the field, students are facing changes in the world whose characteristics have consequences for their formation, as happens with engineers. One of them is the constant change in the way people work. This
Claudio da Rocha Brito
Science and Education Research Council
São Paulo, Brazil
cdrbrito@copec.org.br
Rosa Vasconcelos
University of Minho
Guimarães, Portugal
rosa@det.uminho.pt
aspect is enhanced by the quick and strong technology development. Students today require more advice than properly information acquisition once information is now available any time any place in a huge amount, apart from the ability to be connected all the time. Students should know how to use educational technologies to apply knowledge to new situations, analyze information, collaborate, solve problems, and make decisions [1].
Teachers now use emerging technologies to provide expanded learning opportunities for future generations. They must therefore prepare themselves to motivate students and show them how to acquire the knowledge pertinent to their formation as engineers. The use of new technologies in the classroom is thus an important requirement for teachers in higher education, especially in engineering: it is part of the teaching environment, and teachers need to understand the environment of today's young students. Lifelong learning is clearly an environment not only for future engineers but also for their teachers.
II. HIGHER EDUCATION AND ITS PHILOSOPHICAL ASPECTS
More than ever, education is a key factor for the economic success of a nation, as well as for personal satisfaction and social stability, across all levels of society, age groups, and subject areas.
2014 IEEE Frontiers in Education Conference
978-1-4799-3922-0/14/$31.00 ©2014 IEEE 1925
A philosophical reading of the present educational environment, however, holds that it is subject to the demands of capitalist work: schools and universities are exposed to the interests of a labor market dictated by a capitalist environment ruled by the private sector. The ideals of employability and entrepreneurship aim to convince people that they are as free as the capitalists as long as they are "their own bosses." From this perspective, the competences required by present higher education demand the formation of a professional whose main quality is knowing how to be, rather than the amount of knowledge she or he has [2].
Therefore, in the present historical moment, preparing professionals means preparing them for employability and entrepreneurship, whose main requirement, beyond pertinent knowledge, is this "know how to be": the capability to develop the personal skills that provide adaptability, flexibility, and a problem-solving mind.
From this perspective, educators are seen as the ones responsible for preparing citizens according to the values, skills, and knowledge that capital needs.
In any case, this paradigm of education, boosted by technological resources, is shaping a different kind of education and requiring a different kind of educator, who, philosophical discussions aside, has to survive in an extremely competitive market on a global scale.
This means a deep change in the role and profile of educators, with the following aspects:
• Making time for activities that integrate several disciplines;
• Being willing to learn together with the students and from experience;
• Challenging students with complex tasks that push them to mobilize their knowledge;
• Being aware that the educator is an organizer of didactic situations and a promoter of activities that are meaningful and pertinent to students.
In this sense, the educator's role is not to transmit the knowledge accumulated by humanity; the emphasis of the educator's action is to give students tools that help them understand the world and act on it [3].
In this new context, educators deeply influence the way education unfolds in the classroom. They must now master the art of enticing and fostering students to pursue a career that is truly meaningful to them and to be their best; otherwise it would be very difficult to survive in the labor market. It also means that these professionals hold in their own hands the possibility of mastering their careers, whatever the historical moment and the state of the labor market.
III. ENGINEERING EDUCATOR SKILLS
Knowledge of engineering is an important asset for the new educator, as is developing the competencies of any professional who must compete and succeed in a career.
Besides solid knowledge of her/his field of expertise, the engineering professor is expected to have competences such as:
• Interpersonal skills: the ability to interact with students, listen to them, and gather information to boost the learning process;
• The capability of developing a collaborative work environment;
• Leadership and ethical behavior that shows respect;
• Preparing every class with organization and sound didactics;
• Being innovative and open-minded toward innovations that inspire students;
• Flexibility to accept new ideas and different kinds of personality;
• The ability to learn from, and make varied use of, evaluation;
• A global vision and the capability to create conceptual models as a competitive differential.
It is also very important that many institutions have now developed programs that prepare engineering professors to perform competitively as professionals and to enhance education, forming citizens for this century of uncertainty and challenges [4].
It is not enough for universities to offer good curricula, good labs, and top technology in the classroom, because an important factor in institutional competitiveness is also the teacher. The educator remains a key factor in the success of any educational endeavor, even in an environment where education is treated as a business.
IV. COURSE METHODOLOGY
This program is an adventure toward the discovery of new skills and the acquisition of new tools, providing the opportunity to develop the capability of performing as an educator while following the new trends in education. It is, moreover, a program of international certification, which is itself a new trend in global education.
It is a modular, online program, with credits expressed in ECTS and equivalent hours in accordance with the country's educational legislation.
The certification in engineering education requires:
• three modules of 60 hours (12 ECTS) each,
• adding up to a total of 180 hours, or 36 ECTS [5].
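The credit arithmetic above is simple enough to check directly. A minimal sketch, assuming the stated equivalence of one module = 60 hours = 12 ECTS (the variable names are illustrative only, not part of the program's documentation):

```python
# Credit load of the certification, assuming one module = 60 hours = 12 ECTS.
MODULES = 3
HOURS_PER_MODULE = 60
ECTS_PER_MODULE = 12

total_hours = MODULES * HOURS_PER_MODULE
total_ects = MODULES * ECTS_PER_MODULE

print(total_hours, total_ects)  # prints: 180 36
```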
The program is delivered 100% online, in Portuguese. Its target audience is engineers dedicated to education and citizens of Portuguese-speaking countries, and it is offered to the CPLP – Community of Portuguese Language Countries.
The program is being delivered, and feedback questionnaires are completed by the participants at the end of every module. The numbers and percentages are being computed, and the full results on satisfaction, relevance to careers, and impact on daily classroom performance are expected by the end of 2014. Partial results from current participants show the program to be highly satisfactory so far. Flexibility of access to the content has proved to be an important aspect, since it broadens the possibilities for study. This deserves further examination, not only as information about the quality and success of the program but also as a strategic asset for its future development.
V. LANGUAGE AND FORMATION
Language is part of what we are. With more than 230 million native speakers, Portuguese is the fifth most spoken language in the world [6]. It is spoken in Europe, America, Africa and Asia.
Even in Europe, Portuguese is the third most spoken language, due to emigrant communities spread across the continent.
Portuguese has been an official language of the European Union (formerly the EEC) since 1986, the date of Portugal's admission to the Community. Owing to the agreements of Mercosur (Southern Common Market), to which Brazil is a party, Portuguese is taught as a foreign language in the other participating countries.
In 1996, the Community of Portuguese Language Countries (CPLP) was created, bringing together the countries that have Portuguese as an official language in order to enhance cooperation and cultural exchange among the member countries and to standardize and disseminate the Portuguese language [7].
According to the Sapo Observatory, a language is a more valuable asset the more partners and users it has; language is a "super public good," since sharing increases its value. The economic power of Portuguese speakers represents 4% of global wealth.
About 50% of the new oil and gas discoveries made since 2005 are located in Portuguese-speaking countries, a factor that will transform the geopolitical importance of Portuguese in the global economy.
Indeed, this underlying trend was the cover story of a recent issue of the prestigious magazine Monocle, which ranked Portuguese as "the new global language of power and commerce."
In fact, according to the latest analyses from the consultants IHS and Bernstein, three Portuguese-speaking countries lead the ranking of the top 10 oil and gas discoveries on the planet this decade [8].
VI. SCIENCE AND EDUCATION RESEARCH COUNCIL MAIN ASPECTS
COPEC has become an international organization. Its history began with an idea, shared by a group of scientists, of creating an organization to foster research mainly in the sciences and in education. The idea grew, and after several meetings the Council became a reality. It is a group of scientists, professors, and professionals whose vision of the future drove them to start this work.
The main mission of COPEC is to promote the progress of science and technology for the welfare of humanity.
Through its activities, COPEC maintains relations with universities, educational institutions, enterprises, and society in several countries to discuss the directions of science, technology, and education. The Council has been very active and has achieved many things of great importance for the country in which it is located.
The Council is constituted by scientists from several areas of human knowledge, committed to education and to the development of science and technology. Its members believe that education is the main beam in the construction of a better society, and that science and technology are the great agents fostering progress and promoting human welfare [9]. The organization is now present in Europe, always working to enhance science and education, aiming at a better human life and promoting the social, ethical, and educational values so important to the human condition on the planet.
VII. A PROGRAM TARGETING THE IMPROVEMENT OF EDUCATORS IN THE ENGINEERING FIELD
The proud main mission of COPEC – Science and Education Research Council is to promote the progress of science and technology for the welfare of humanity.
Its main objective is to promote a learning community and the development of the fields of education and science, constituting an intelligent form of collective knowledge for integration with the social and economic agents of the community [10].
COPEC develops many activities in science and education. One of them is the IGIP National Monitoring Committee of Brazil, which provides the courses for engineering educators; COPEC also has long experience in developing and implementing engineering programs. Its engineering education research team decided to develop and offer a specific program for engineers dedicated to education.
It is an international organization that has been preparing engineering educators in Europe for more than 34 years, and now does so worldwide [11].
The offering of the course originated in the initiative of both organizations, which want to bring engineering education training, skills, and practice to international engineering educators.
For some years, the IIE – COPEC's Institution of Education – delivered the program in the classroom. However, the number of people who could attend was very small, since the program took place in a single geographical region of the country. With the development of online delivery technology and the possibility of reaching a much larger community, COPEC's institution of education decided to offer the program online.
The target audience has thus spread to the Portuguese-speaking countries, which cover a large portion of the planet and are becoming economically important on the global scene.
The certification is recognized by the ministry of education at the national level. Internationally, it corresponds to the ING.PAED.IGIP register of IGIP, the International Society for Engineering Pedagogy [12].
VIII. ADMISSION REQUIREMENTS
As in any graduate program, candidates must meet admission requirements:
• Candidates should hold at least a master's degree in science, engineering, or technology;
• Professionals with other backgrounds are considered on the basis of their interest, formal education, and teaching experience [13].
IX. THE PROGRAM MODULES
Core Modules:
• Engineering Education in Theory
• Engineering Education in Practice
• Laboratory Didactics
Theory Modules:
• Psychology
• Sociology
• Engineering ethics
Practice Modules:
• Presentation and communication skills
• Scientific writing
• Working with projects
• ICT in engineering education
Elective Modules (chosen by the institutions that deliver the program):
• Intercultural competence
• Evaluation of student performance, grading and assessment
• Quality management
• Curriculum development
• CLIL – Content and Language Integrated Learning
• Portfolio assessment
• Creative thinking
• Collaborative work
• Coaching and mentoring
• Infoliteracy
X. PROGRAM EXPECTED OUTCOMES
There is no doubt that the most valuable result of this program is the quality of teachers prepared to adopt new styles of teaching, and of learning [14]. Living is learning, and teaching is learning as well, since the experience of teaching is a two-way path in the construction of knowledge.
The program mainly fosters skills such as:
• Applying new knowledge in the classroom immediately;
• Generating new ideas and new ways of teaching;
• Learning with the students;
• Creating a classroom environment based on confidence, ethics, and teamwork.
It indeed helps engineering educators to look at new styles of teaching and to pursue quality classes based on pertinent knowledge, developing a two-way flow of information [15].
XI. FINAL DISCUSSIONS
The program is the result of the efforts of COPEC Engineering Education Research team and is offered by its International Institute of Education.
The program has been designed to fit the needs of professionals interested in improving their careers and the quality of their performance.
The choice to deliver it online provides the opportunity to reach a larger Portuguese-speaking audience across different countries.
Being flexible, the program develops in accordance with the needs of its main goal: to form engineering educators prepared for the demands of the 21st century.
It has an impact on the academic community, as it gives engineers the opportunity to update their knowledge in what is, from now on, a lifelong education environment.
Besides being a valuable and rich program that provides tools for teaching, it grants the engineering educator a certification that is recognized both nationally and internationally.
Another aspect revealed by the statistics is that, although the program is delivered in Portuguese, some people from Spanish-speaking countries are attending it. This indicates that the spectrum of online courses can be even broader, reaching those whose mother tongue is not Portuguese but who understand it well enough to study any subject.
The basis of this paper is the work and the real experience of the members of the COPEC engineering education team. They have long experience in engineering education research and in developing engineering programs for different engineering schools, as well as knowledge of the state of the art in engineering education. It is an international group of engineers dedicated to the development of engineering education worldwide.
ACKNOWLEDGMENT
This work is funded by FEDER funds through the Operational Program for Competitiveness Factors (COMPETE) and by National Funds through FCT – Foundation for Science and Technology, under the projects FCOMP-01-0124-FEDER-022674 and PEst-C/CTM/UI0264/2013.
REFERENCES
[1] http://www.dgrhe.min-edu.pt
[2] http://www.academia-engenharia.org/direscrita/ficheiros/ EngineeringEducationPortugal_FinalReport.pdf
[3] http://www.iadb.org/res/laresnetwork/files/pr294finaldraft.pdf
[4] http://www.iadb.org/res/publications/pubfiles/pubR-463.pdf
[5] Brito, C. da R.; Ciampi, M. M., Braga, M. S.; Braga, E. R. Green Lifestyle becoming the Men's New Way of Life. In: Safety, Health and Environmental World Congress, 14, Cubatão, 2014. Green Lifestyle becoming the Men's New Way of Life. Cubatão: SHERO, 2014.
[6] Brito, C. da R.; Ciampi, M. M.; Vasconcelos, R. M.; Amaral, L.; Barros, V. F. A. Environmental Engineering Program Preparing Engineers to Tackle New Challenges. In: IGIP Annual Symposium, 42, Kazan, 2013. The Global Challenges in Engineering Education. Kazan: IGIP, 2013.
[7] http://www.linguaportuguesa.ufrn.br/pt_3.php
[8] http://observatorio-lp.sapo.pt/pt/geopolitica/o-valor-economico-da-lingua-portuguesa/a-era-do-petroleo
[9] Brito, C. da R.; Ciampi, M. M.; Vasconcelos, R. M.; Amaral, L.; Al-Ubaidi, M. Engineering Education in a Technology-dependent World. In: International Conference on Engineering and Technology Education, 13, Guimarães, 2014. Engineering Education in a Technology-dependent World. Guimarães: INTERTECH 2014.
[10] Brito, C. da R.; Ciampi, M. M. Technological Development, Sustainability: Discussions about International Aspects of Engineering Education. In: IEEE EDUCON Annual Conference, 01, Madrid, 2010. The Future of Global Learning in Engineering Education. Madrid: IEEE, 2010.
[11] Brito, C. da R.; Ciampi, M. M.; Vasconcelos, R. M.; Amaral, L.; Barros, V. F. A. Challenging Time for Engineering. In: American Society for Engineering Education Annual Conference, 120, Atlanta, 2013. 2013 ASEE Annual Conference Program & Proceedings. Atlanta: ASEE, 2013.
[12] Brito, C. da R.; Ciampi, M. M. Forming Engineers for a Growing Demand. In: International Conference on Engineering and Computer Education, 8, Luanda, 2013. Forming Engineers for a Growing Demand. Luanda: ICECE, 2013.
[13] Brito, C. da R.; Ciampi, M. M.; Barros, V. F. A. Innovative and Reliable Information Technology for a Sustainable World. In: World Congress on Systems Engineering and Information Technology, 01, Porto, 2013. Innovative and Reliable Information Technology for a Sustainable World. Porto: WCSEIT, 2013.
[14] Brito, C. da R.; Ciampi, M. M.; Vasconcelos, R. M.; Amaral, L.; Barros, V. F. A. Interdisciplinary Environmental Engineering Program. In: European Society of Engineering Education Annual Conference, 41, Leuven, 2013. Engineering Education Fast Forward. Leuven: SEFI, 2013.
[15] Brito, C. da R.; Ciampi, M. M.; Vasconcelos, R. M.; Amaral, L. Engineering Education in Countries of Portuguese Language. In: ASEE/IEEE Frontiers in Education Annual Conference, 43, Oklahoma City, 2013. Energizing our Future. Oklahoma City: FIE, 2013.
A Program Designed to Empower Engineering
Educators
Melany M. Ciampi
Safety, Health and Environment Research Organization
São Paulo, Brazil
melany@copec.org.br
Luis Amaral
Computer Graphics Center
Guimarães, Portugal
amaral@dsi.uminho.pt
Victor F. A. Barros
Science and Education Research Council
Braga, Portugal
victor@copec.org.br
Abstract— Utilizing emerging technologies to provide expanded learning opportunities is critical to the success of future generations. So teachers have to be prepared to motivate and entice the students about getting knowledge pertinent for their formation as engineers. This paper describes the "International Engineering Educator" developed by the engineering education research team of COPEC – Science and Education Research Council. It is offered by the International Institute of Education of COPEC, which is a certification organization in accordance with the Ministry of Education of the Country referring to the National Law of Higher Education. It also offers a professional register as "International Engineering Educator" of the International Society for Engineering Pedagogy for those who are interested in such certification as something else. The target attendees for this program are the engineering community of CPLP – Community of Portuguese Language Countries. It is successful and many interested professionals are attending the program.
Keywords— On line technology; cultural skills; innovation; global vision; leadership.
I. INTRODUCING
It is a fact that most of the engineers in education field at the moment are engineers with a PhD or Master degrees. This does not mean that the professional has all the tools to be a proper teacher and now more than ever, once we have students with peculiarities that were unthinkable not long ago.
No matter the field, students are facing exchanges in the world, which characteristics have consequences for their formation, as happens with engineers. One of them is the constant change in the way people develop the work. This
Claudio da Rocha Brito
Science and Education Research Council
São Paulo, Brazil
cdrbrito@copec.org.br
Rosa Vasconcelos
University of Minho
Guimarães, Portugal
rosa@det.uminho.pt
aspect is enhanced by the quick and strong technology development. Students today require more advice than properly information acquisition once information is now available any time any place in a huge amount, apart from the ability to be connected all the time. Students should know how to use educational technologies to apply knowledge to new situations, analyze information, collaborate, solve problems, and make decisions [1].
It is necessary to have in mind and realize that teachers now use emerging technologies to provide expanded learning opportunities for future generations’ knowledge achievement. So teachers have to prepare themselves to motivate and entice the students on how to get knowledge pertinent for their formation as engineers. So it is possible to state that the use of new technologies in classroom is also an important requirement for teachers in Higher Education, specially in Engineering. It is part of teaching environment and therefore it is necessary to understand the environment of a young pupil. It is clear that life long learning environment is not only for future engineers but also for teachers.
II. HIGHER EDUCATION AND ITS PHILOSOPHICAL ASPECTS
More than ever, education is a key factor for economic success of a nation, as well as personal satisfaction, social stability everywhere and for all levels of society, age groups and subject area.
A philosophical aspect of present education environment however states that it is subjected to the capitalist work demands. It means that schools and universities are exposed to the interests of work market dictated by the capitalism
2014 IEEE Frontiers in Education Conference
978-1-4799-3922-0/14/$31.00 ©2014 IEEE 1925
environment ruled by private sector. The ideals of employability and entrepreneurship have the goal to convince people that they are free as the capitalists as long as they are “their own bosses”. Through this perspective, the competences required by present higher education demand the formation of a professional which main aspect is that s/he has to be capable to know how to be and not so much about the amount of knowledge that s/he has [2].
Therefore, taking into account the present historical moment, preparing a professional means preparing them for the employability and entrepreneurship which main requirement, besides the pertinent knowledge, is the “know how to be”, i.e., the capability to develop personal skills that provide adaptability, flexibility and problem solve mind.
Under this perspective educators are considered the ones responsible for the preparation of citizens according to the values, skills and knowledge that the capital needs.
Anyway, this paradigm of education busted by technology resources is shaping a different kind of education requiring a different kind of educator who, apart from philosophical discussions, has to survive in this extremely competitive market in a global scale.
This means a deep change in the role and profile of educators. This change leads to the following aspects:
• Time for activities that integrates the several disciplines
• Willing to learning altogether with the students and with the experience
• Challenge the students with complex tasks that
enhance them to mobilize their knowledge
• To be aware that s/he is a didactic situation organizer and also a buster of activities that are meaningful and pertinent for them.
In this sense, the role of an educator is not to transmit the knowledge accumulated by humanity; the emphasis of the educational action of educators is to provide the students with tools that help them to understand the world and so to act on it [3].
In this new context, educators have a deep influence in the way education is being developed in classroom. The educators have now to master the art of entice and foster students to pursue a career that is really meaningful for them and to be the best, otherwise it would be very difficult to survive in the work market. It also means that the professional has in her/his hands the possibility to master her/his career despite the historical moment and no matter the work market.
III. ENGINEERING EDUCATOR SKILLS
Knowledge of engineering is an important factor for the new educator as well as to develop some competencies as any professional who has to compete and achieve success in the career.
Besides having the solid knowledge about her/his field of expertise, the engineering professor is expected to have some competences such as:
• Interpersonal skills, which mean to be able to interact with students, listen to them and get information to boost the learning process;
• To be capable of developing a collaborative work environment;
• Leadership and ethical behavior that shows respect;
• Always prepare the classes with organization and didactic;
• To be innovative, open minded for innovations that inspire students;
• Flexibility to accept new ideas and different kinds of personality;
• To be able to learn and make different use of evaluation;
• Global vision and capability of create conceptual models as a competitive differential.
A very important aspect is also that presently many institutions have developed programs to prepare the engineering professor to perform in order to be competitive as professional and to enhance education to form the citizen for this century of uncertainty and challenges [4].
It is not enough that universities as education institutions offer good curriculums, good labs and have top technology available in classroom because, in part, an important institutional competitiveness factor is also the teacher. Still the educator is a key factor for the success of any educational endeavor even in an environment where education is considered as business.
IV. COURSE METHODOLOGY
This program is an adventure toward the discovery of new skills and the acquisition of new tools that will provide the opportunity to develop the capability of performing as educator, always following the new trends in education. Besides this is a program of international certification, which is also a new trend in global education.
It is a modular program, on line, with credits in ECTS with equivalent in hours in accordance to the educational legislation of the Country.
The certification in engineering education requires:
• each module of 60 hours or 12 ECTS
• so 3 modules that add up 180 hours or 36 ECTS [5].
The program is delivered 100% on line.
The program is delivered in Portuguese Language and the target audience is the engineers dedicated to the education and citizens of countries of Portuguese language.
The program is offered to the CPLP – Community of Portuguese Language Countries.
The program is being delivered and the feedback questionnaires answered by the participants are completed after every module conclusion. The numbers and percentages are being computed and by the end of 2014 the full results on satisfaction, relevance for career and impact in daily classroom
2014 IEEE Frontiers in Education Conference
1926
performance are expected. A partial result given by present participants shows the program so far to be very much satisfactory. The flexibility in terms of content access has showed to be an important aspect once it gives a larger possibility of study. It is a point for further examination regarding not only information about quality and success of the program but also as a strategic asset for the future development of the program.
V. LANGUAGE AND FORMATION
Language is part of what we are. With more than 230 million native speakers, Portuguese is the fifth most spoken language in the world [6]. It is spoken in Europe, America, Africa and Asia.
Even in Europe, Portuguese is the third most spoken language, due to emigrant communities spread across the continent.
Portuguese is an official language of the European Union (former EEC) since 1986, date of the admission of Portugal in the Community. Due to the agreements of Mercosur (Southern Common Market), of which Brazil is a party, Portuguese is taught as a foreign language in other countries that participate in it.
In 1996, the Community of Portuguese Language Countries (CPLP), which brings together countries with Portuguese as the official language for the purpose of enhancing cooperation and cultural exchanges between the member countries and to standardize and disseminate the Portuguese language was created [7].
According to the Sapo Observatory, a language is a much more valuable asset the more partners and the more users it has. Language is a super public good, since the sharing increases its value. The economic power of Portuguese speakers represents 4% of global wealth.
About 50% of new oil and gas discoveries made since 2005 are located in countries of Portuguese Language. This will be a transformational factor of geopolitical importance of Portuguese in the global economy.
Indeed, this underlying trend was the cover story in a recent issue of the prestigious Monocle magazine, which ranked Portuguese as "the new global language of power and commerce".
In fact, according to the latest analysis from consultants IHS and Bernstein Analysis three Portuguese-speaking countries lead the ranking of the top 10 discoveries of oil and gas on the planet this decade [8].
VI. SCIENCE AND EDUCATION RESEARCH COUNCIL MAIN
ASPECTS
COPEC has become an international organization with a History that started with an idea shared by some scientists of creating an organization to foster the research mainly in sciences and education. This idea seized larger proportions and after some meetings the Council became reality. This is a
group of scientists, professors and professionals whose vision of future has driven them to start this work.
The main mission of COPEC is to promote the progress of science and technology for the welfare of humanity.
Through its activities, COPEC maintains relations with universities, institutions of education, enterprises and the society of several countries for the discussion of sciences, technology and education directions.
COPEC - Science and Education Research Council has been very active and has accomplished much of great importance for the country in which it is located.
The Council is an organization of scientists from several areas of human knowledge committed to education and to the development of science and technology.
Its members believe that education is the main beam in the construction of a better society, and that science and technology are the great agents of progress that promote human welfare [9].
The organization is now present in Europe, always working to enhance science and education, targeting the development of a better human life and promoting the social, ethical, and educational values so important for the human condition on the planet.
VII. A PROGRAM TARGETING THE IMPROVEMENT OF EDUCATORS IN THE ENGINEERING FIELD
The proud central mission of COPEC - Science and Education Research Council is to promote the progress of science and technology for the welfare of humanity.
The main objective is to promote a community of apprenticeship and the development of education and the sciences, constituting an intelligent form of collective knowledge for integration with the social and economic agents of the community [10].
COPEC develops many activities in science and education. One of them is the IGIP National Monitoring Committee of Brazil, which provides courses for engineering educators and has broad experience developing and implementing engineering programs. The engineering education research team decided to develop and offer a specific program for engineers dedicated to education.
IGIP is an international organization that has been preparing engineering educators in Europe for more than 34 years and now does so worldwide [11].
The course grew out of the initiative of both organizations, which want to bring engineering education training, skills, and practice to international engineering educators.
For some years, the IIE, COPEC's Institution of Education, delivered the program in the classroom. However, the number of people who could attend the program was small, since it was offered in only one geographical region of the country. With the development of online technology for delivering programs and the possibility of reaching a much larger community, the COPEC institution of education decided to offer the program online.

2014 IEEE Frontiers in Education Conference
The target audience has thus spread to the Portuguese-speaking countries, which cover a large portion of the planet and include countries that are becoming economically important in the global scenario.
The certification is recognized by the ministry of education at the national level. Internationally, the certification corresponds to the ING-PAED IGIP title granted by IGIP, the International Society for Engineering Pedagogy [12].
VIII. ADMISSION REQUIREMENTS
As in any graduate program, the requirements for admission of candidates are:
• Candidates should have at least a master's degree in science, engineering, or technology;
• Professionals with other backgrounds are considered based on their interest, formal education, and teaching experience [13].
IX. THE PROGRAM MODULES
Core Modules:
• Engineering Education in Theory
• Engineering Education in Practice
• Laboratory Didactics
Theory Modules:
• Psychology
• Sociology
• Engineering ethics
Practice Modules:
• Presentation and communication skills
• Scientific writing
• Working with projects
• ICT in engineering education
Elective Modules (at the discretion of the institutions that deliver the program):
• Intercultural competence
• Evaluation of student performance, grading and assessment
• Quality management
• Curriculum development
• CLIL – Content and Language Integrated Learning
• Portfolio assessment
• Creative thinking
• Collaborative work
• Coaching and mentoring
• Infoliteracy
X. PROGRAM EXPECTED OUTCOMES
There is no doubt that the most valuable result of this program is the quality of teachers prepared to adopt new styles of teaching and learning [14]. Living is learning, and teaching is learning as well, as the experience of teaching is a two-way path in the construction of knowledge.
The program mainly fosters skills such as:
• Applying new knowledge in the classroom immediately;
• Generating new ideas and new ways of teaching;
• Learning with the students;
• Creating a classroom environment based on confidence, ethics, and teamwork.
It helps engineering educators to explore new styles of teaching and to pursue quality classes based on pertinent knowledge, developing a two-way flow of information [15].
XI. FINAL DISCUSSIONS
The program is the result of the efforts of COPEC Engineering Education Research team and is offered by its International Institute of Education.
The program has been designed to fit the needs of professionals interested in improving their careers and the quality of their performance.
The choice to deliver it online provides the opportunity to reach a larger Portuguese-speaking audience across different countries.
Being a flexible program, it is developed in accordance with the needs underlying its main goal: to form engineering educators prepared for the demands of the 21st Century.
It has an impact on the academic community, since it provides engineers with the opportunity to update their knowledge in what is, from now on, a lifelong education environment.
Besides being a valuable and rich program that provides tools for teaching, it grants the engineering educator a certification that is recognized both nationally and internationally.
Another aspect revealed by the statistics is that, although the program is delivered in Portuguese, some people from Spanish-speaking countries are attending it. This indicates that the reach of online courses can be even broader, extending to those whose mother language is not Portuguese but who understand it at a level that allows them to study any subject.
The information basis for the development of this paper is the work and real experience of the members of the COPEC engineering education team. They have extensive experience in engineering education research and in developing engineering programs for different engineering schools, as well as knowledge of the state of the art in engineering education. It is an international group of engineers dedicated to the development of engineering education worldwide.
ACKNOWLEDGMENT
This work is funded by FEDER funds through the Operational Program for Competitiveness Factors
(COMPETE) and National Funds through FCT - Foundation for Science and Technology under the Project: FCOMP-01-0124-FEDER-022674 and PEst-C/CTM/UI0264/2013.
REFERENCES
[1] http://www.dgrhe.min-edu.pt
[2] http://www.academia-engenharia.org/direscrita/ficheiros/ EngineeringEducationPortugal_FinalReport.pdf
[3] http://www.iadb.org/res/laresnetwork/files/pr294finaldraft.pdf
[4] http://www.iadb.org/res/publications/pubfiles/pubR-463.pdf
[5] Brito, C. da R.; Ciampi, M. M., Braga, M. S.; Braga, E. R. Green Lifestyle becoming the Men's New Way of Life. In: Safety, Health and Environmental World Congress, 14, Cubatão, 2014. Green Lifestyle becoming the Men's New Way of Life. Cubatão: SHERO, 2014.
[6] Brito, C. da R.; Ciampi, M. M.; Vasconcelos, R. M.; Amaral, L.; Barros,
V. F. A. Environmental Engineering Program Preparing Engineers to Tackle New Challenges. In: IGIP Annual Symposium, 42., Kazan, 2012. The Global Challenges in Engineering Education. Kazan: IGIP, 2013.
[7] http://www.linguaportuguesa.ufrn.br/pt_3.php
[8] http://observatorio-lp.sapo.pt/pt/geopolitica/o-valor-economico-da-lingua-portuguesa/a-era-do-petroleo
[9] Brito, C. da R.; Ciampi, M. M.; Vasconcelos, R. M.; Amaral, L.; Al-Ubaidi, M. Engineering Education in a Technology-dependent World. In: International Conference on Engineering and Technology Education, 13, Guimarães, 2014. Engineering Education in a Technology-dependent World. Guimarães: INTERTECH 2014.
[10] Brito, C. da R.; Ciampi, M. M. Technological Development, Sustainability: Discussions about International Aspects of Engineering Education. In: IEEE EDUCON Annual Conference, 01, Madrid, 2010. The Future of Global Learning in Engineering Education. Madrid: IEEE, 2010.
[11] Brito, C. da R.; Ciampi, M. M.; Vasconcelos, R. M.; Amaral, L.; Barros, V. F. A. Challenging Time for Engineering. In: American Society for Engineering Education Annual Conference, 120, Atlanta, 2013. 2013 ASEE Annual Conference Program & Proceedings. Atlanta: ASEE, 2013.
[12] Brito, C. da R.; Ciampi, M. M. Forming Engineers for a Growing Demand. In: International Conference on Engineering and Computer Education, 8, Luanda, 2013. Forming Engineers for a Growing Demand. Luanda: ICECE, 2013.
[13] Brito, C. da R.; Ciampi, M. M.; Barros, V. F. A. Innovative and Reliable Information Technology for a Sustainable World. In: World Congress on Systems Engineering and Information Technology, 01, Porto, 2013. Innovative and Reliable Information Technology for a Sustainable World. Porto: WCSEIT, 2013.
[14] Brito, C. da R.; Ciampi, M. M.; Vasconcelos, R. M.; Amaral, L.; Barros, V. F. A. Interdisciplinary Environmental Engineering Program. In: European Society of Engineering Education Annual Conference, 41, Leuven, 2013. Engineering Education Fast Forward. Leuven: SEFI, 2013.
[15] Brito, C. da R.; Ciampi, M. M.; Vasconcelos, R. M.; Amaral, L. Engineering Education in Countries of Portuguese Language. In: ASEE/IEEE Frontiers in Education Annual Conference, 43, Oklahoma City, 2013. Energizing our Future. Oklahoma City: FIE, 2013.
IEEE CAMAD 2016 will be held as a stand-alone event at Ryerson University, Toronto, ON, Canada. This year, IEEE CAMAD will focus on Communications for Smart Cities. The conference will host several special sessions and will bring together scientists, engineers, manufacturers, and service providers to exchange and share their experiences and new ideas on research and innovation in wireless communications for smart cities. In addition to contributed papers, the conference will also include keynote speeches, panel sessions, and demos.
Areas of Interest
IEEE CAMAD is soliciting papers describing original work, unpublished and not currently submitted for publication elsewhere, on topics including, but not limited to, the following:
• Autonomic Communication Systems and Self-Organized Networks
• Big data in the network
• Body Area Networks and Applications
• Cloud Computing, Network Virtualization and SDN
• Cognitive Radio and Network Design
• Communications for Smart Home and Community
• Cross-Layer & Cross-System Protocol Design
• Design of Content Aware Networks and Future Media Internet
• Design of Satellite Networks
• Design, Modeling and Analysis of Wireless, Mobile, Ad hoc and Sensor Networks
• Design, Modeling and Analysis of Network Services and Systems
• Design, modeling and analysis of Ubiquitous sensing in IoT
• Future Service Oriented Internet Design
• Green Wireless Communication Design
• Modeling and Simulation Techniques for Integrated Communication Systems
• Modeling and Simulation Techniques for IoT and M2M Communications
• Modeling and Simulation Techniques for Mobile Social Networks
• Next Generation Mobile Networks
• Network Monitoring and Measurements
• Network Optimization and Resource Provisioning
• Next Generation Internet
• Optical Communications & Fiber Optics
• Network Management, Middleware Technologies and Overlays
• Quality of Experience: Framework, Evaluation and Challenges
• Seamless Integration of Wireless, Cellular and Broadcasting Networks with Internet
• Fast Simulation Techniques for Communication Networks
• Simulation Techniques for Large-Scale Networks
• Smart Grids: Communication, Modeling and Design
• Test Beds and Real Life Experimentation
• Traffic Engineering, Modeling and Analysis
• Validation of Simulation Models with Measurements
Submissions
Prospective authors are invited to submit a full paper of not more than six (6) IEEE-style pages, including results, figures, and references. Papers should be submitted via EDAS (https://edas.info/newPaper.php?c=22298). Papers submitted to the conference must describe unpublished work that has not been submitted for publication elsewhere. All submitted papers will be reviewed by at least three TPC members. All accepted and presented papers will be included in the conference proceedings and the IEEE digital library.
Important Dates
Paper submission: 5th June 2016 (extended)
Author notification: 5th July 2016
Camera ready: 1st August 2016
General Chairs
Jelena Misic, Ryerson University, ON, Canada Burak Kantarci, Clarkson University, NY, USA
Technical Program Chairs
Dongmei Zhao, McMaster University, ON, Canada Petros Spachos, University of Guelph, ON, Canada Marco Di Renzo, Paris-Saclay University / CNRS, France
Ioannis Papapanagiotou, Netflix, USA
Panel Chairs
Stefano Giordano, University of Pisa, Italy Yves Lostanlen, SIRADEL, Toronto, ON, Canada
Tutorial Chairs
John Vardakas, IQUADRAT, Spain
Isaac Woungang, Ryerson University, Canada
Demo Chairs
Luca Foschini, University of Bologna, Italy
Yaoqing Liu, Clarkson University, NY, USA
Damla Turgut, University of Central Florida, FL, USA
Publicity Chairs
Octavia Dobre, Memorial University of
Newfoundland, Canada
Kun Yang, University of Essex, UK
William Liu, Auckland University of Technology,
New Zealand
Wenjia Li, New York Institute of Technology, USA
Kambiz Ghazinour, Kent State University, OH, USA
Publication Chairs
Mujdat Soyturk, Marmara University, Turkey Michele Nogueira, Federal University of Parana, Brazil
Keynote Speakers Chairs
Michael Devetsikiotis, North Carolina State
University, NC, USA
Dzmitry Kliazovich, University of Luxembourg,
Luxembourg
Hazem Refai, University of Oklahoma, OK, USA
Award Chairs
Nelson Fonseca, State University of Campinas,
Brazil
Fabrizio Granelli, University of Trento, Italy
Christos Verikoukis, CTTC, Spain
Charalabos Skianis, University of the Aegean, Greece
Special Sessions Chairs
Jeanna Matthews, Clarkson University, NY, USA Tassos Dagiuklas, Hellenic Open University, Greece Kemal Tepe, University of Windsor, Canada
Local Arrangement Chair
Vojislav Misic, Ryerson University, ON, Canada
Web and social media Chair
Kambiz Ghazinour, Kent State University, OH, USA Nima Zaler, March Networks, Canada
What is a Stem Cell?
Most cells in the human body have an assigned purpose. They are liver cells, fat cells, bone cells, and so on. These cells can replicate more of their own kind of cell, but they cannot differentiate into another kind of cell.
Stem cells are the primitive cells from which all other cells develop. They are undifferentiated cells with the ability not only to self-replicate, but also to specialize into different types of human cells. There are several types of stem cells; the kind used in orthopedic stem cell therapy is the mesenchymal stem cell (MSC).
An MSC is a cell with strong potential for tissue repair because it can:
• Self-replicate
• Reduce inflammation
• Combat cell death
• Differentiate into more than one specialized cell of the body (including bone cells, muscle cells, cartilage cells, and fat cells)
In medical research, tissues such as muscles, cartilage, tendons, ligaments, and vertebral discs have shown some capacity for self-repair. As a result, tissue engineering and the use of mesenchymal cells and/or bioactive molecules such as growth factors are being tested and studied to determine the role they can play in tissue regeneration and repair.
How Does Orthopedic Stem Cell Therapy Work?
Mesenchymal stem cells (MSCs) are adult stem cells that can be found in bone marrow. Doctor Moore performs autologous stem cell therapy, which means that the stem cells used in your treatment are taken from your own body, not from a donor. Using your own stem cells for the procedure helps reduce your risk of infection and eliminate the possibility of immune rejection.
In an autologous stem cell procedure, your physician will draw a sample of fat or bone marrow from the abdomen or hip. The sample is filtered and concentrated in a sterile environment, then injected into the area of your body that you are trying to heal. This procedure is done on an outpatient basis under sedation and leaves minimal scarring.
The idea behind orthopedic stem cell therapy is that the injection of these concentrated regenerative cells into an area of your body experiencing degeneration will kick-start your body's ability to heal itself. These injections can be given independent of, or in conjunction with, an orthopedic surgical procedure.
Orthopedic Areas of Interest for Stem Cell Therapy
• Articular Cartilage – Damage to the articular cartilage following an injury has poor potential for repair and can lead to arthritic changes many years after injury. Recent studies have shown favorable outcomes and better knee scores at 2 year follow up for mesenchymal cells compared to current techniques of microfracture and autologous chondrocyte implantation.
• Bone – Trauma and some pathological conditions can lead to extensive bone loss, which requires transplantation of bone or other bone substitutes to restore structural integrity. A large number of studies have shown great potential for mesenchymal cells to repair critically sized bone defects, noting better bone growth and more robust bone formation than in control groups.
• Tendons and Ligaments – Injuries to tendons and ligaments heal by forming inferior quality tissue. Autografts, allografts, and resorbable materials have been used to repair defects in tendons and ligaments, but these carry risks including donor site morbidity, scar formation, and tissue rejection. A number of studies on the use of mesenchymal stem cells to improve the repair of tendons and tendon defects have been carried out with favorable results when measured in histology and tissue strength. The use of mesenchymal cells with tissue allografts enhances the graft and improves the biomechanical properties compared to control studies.
• Meniscus – Most tears of the meniscus occur in avascular zones with little or no potential for repair. Standard biological healing processes produce limited results and meniscectomy (removal of all or part of the torn meniscus) has been shown to have a strong association with subsequent development of osteoarthritis. Recently, studies have shown that self-paced therapy including mesenchymal stem cells demonstrates biological healing and adherence of meniscal tears in avascular zones.
• Spine – Degeneration of intervertebral discs is a common cause of back pain and morbidity. Most patients are treated conservatively with improvement in approximately 90%. If conservative treatment proves ineffective, the surgical options for discogenic back pain are
limited and usually invasive. Cell-based tissue treatments, including mesenchymal stem cell injections for degenerative disc subjects have been shown to diminish the incidence of low back pain, with clinical results noting improvement in back pain and MRI results showing regeneration of disc tissue. In cases where spinal fusions are necessary, the use of stem cells has shown greater success in obtaining fusion through bone formation as compared to standard fusion techniques.
• Osteonecrosis – Osteonecrosis or avascular ischemia of the hip can be associated with progression to an advanced arthritic joint. Standard treatment for osteonecrosis has included core decompression with limited results. Studies report improvement in hip scores in patients treated with mesenchymal stem cells and core decompression versus core decompression alone.
Is Orthopedic Stem Cell Therapy Covered By My Insurance?
No. Because mesenchymal stem cell injections are considered investigational for orthopedic applications, most insurance companies will not cover the cost. Please contact our office to discuss cash payment options.
What is the Cost of Orthopedic Stem Cell Therapy?
Your out-of-pocket cost will vary, depending upon whether you have stem cell therapy independent of or in conjunction with another surgical procedure.
Department of Mathematics and Natural Sciences

Dr. Jonathan Corbett
Associate Professor of Mathematics
CorbettJ@hssu.edu
(314) 340-3319
HGA 317

Degrees Held:
Doctor of Philosophy, Mathematics, Washington University
Bachelor of Arts, Mathematics, University of California, Berkeley
Master of Arts, Mathematics, Washington University

Bio: Jonathan Corbett grew up in a small logging town in rural Northern California. After high school, he attended the University of California at Berkeley, where he obtained a B.A. in Mathematics. After this, he attended graduate school at Washington University in St. Louis, where he studied harmonic analysis and received his Ph.D. in 1999. Afterwards, he took a postdoctoral fellowship in Statistical Genetics in the Department of Psychiatry at the Washington University School of Medicine. At the end of his postdoc, he accepted a faculty position in the Division of Biostatistics at Washington University in 2002 and moved to the Department of Genetics in 2005.

In 2007, he left the Department of Genetics to return to his first love: teaching and studying mathematics. This paid off in 2009 when he accepted a tenure-track position as an Assistant Professor of Mathematics at Harris-Stowe State University.
Courses: At Harris-Stowe:
ALG 0036 and 0038: Developmental Algebra
MATH 0120: Structures of Mathematical Systems
MATH 0135: College Algebra
MATH 0136: Finite Math
MATH 0170: Calculus and Analytic Geometry I
MATH 0203: Applied Calculus for Business Majors
MATH 0240: Foundations of Advanced Math
MATH 0241: Calculus and Analytic Geometry II
MATH 0242: Calculus and Analytic Geometry III
MATH 0320: Introduction to Modern Algebra
MATH 0321: Abstract Algebra
MATH 0327: Introduction to Topology
MATH 0375: Introduction to Real Analysis
MATH 0456: Linear Algebra
MATH 0461: Differential Equations
Publications: (Selected Publications)
J.-P. Leduc, J. Corbett, M.V. Wickerhauser. 1998. "Rotational wavelet transforms for motion analysis, estimation, and tracking." Proceedings of the 1998 International Conference on Image Processing, vol. 2, pp. 195-199.
J. Corbett, J.-P. Leduc, M. Kong. 1999. "Analysis of deformational transformations with spatio-temporal continuous wavelet transforms." Proceedings of the 1999 IEEE International Conference on Acoustics, Speech, and Signal Processing, vol. 6, pp. 3189-3192.
N.L. Saccone, J.M. Kwon, J. Corbett, A. Goate, N. Rochberg, H.J.
Edenberg, T. Foroud, T-K. Li, H. Begleiter, T. Reich, J.P. Rice.
2000. "A genome screen of maximum number of drinks as an alcoholism phenotype." American Journal of Medical Genetics (Neuropsychiatric
Genetics) vol. 96, pp.632-637.
J.P. Rice, N.L. Saccone, J. Corbett. 2001. "The lod score method." In
Advances in Genetics vol. 42, pp 99-113, (DC Rao ed.)
J. Corbett, A. Kraja, I.B. Borecki, M.A. Province. 2003. “Use of a random coefficient regression (RCR) model to estimate growth parameters.”
BMC Genetics, vol 4 (Supplement 1), S5.
J. Corbett, C. Gu, J.P. Rice, T. Reich, M.A. Province, D.C. Rao.
2004. “Power loss for linkage due to the dichotomization of trichotomous phenotypes.” Human Heredity, vol 57(1), pp. 21-27.
J. Corbett, N.L. Saccone, T. Foroud, A. Goate, H.J. Edenberg, J.
Nurnberger Jr., H. Begleiter, T. Reich, J.P. Rice. 2005. "A sex- and
age-adjusted genome screen for nested alcohol dependence
diagnoses." Psychiatric Genetics, vol 15(1), pp. 25-30.
A. Kraja, J. Corbett, P. An, R.S. Lin, P.A. Jacobsen, M. Chrosswhite, I.B. Borecki, M.A. Province. 2007. "Rheumatoid arthritis, item response theory, Blom transformation, and mixed models." BMC Proceedings 2007, 1(Suppl 1):S116.
Presentations: Accepted for Presentation:
October 2012: "Formalism and the Transition to Proof-Based Mathematics",
National Council of Teachers of Mathematics, Dallas, TX (jointly with Ann
Podleski)
Recent Presentations:
April 2012: "Embracing Open Source Materials in the Undergraduate
Curriculum," Mathematics Association of America, Missouri Section Meeting,
St. Louis, MO (jointly with Ann Podleski)
March 2012: "Modular arithmetic with Applications to Clocks and
Calendars," Beginning Teachers Assistance Program, Harris-Stowe State
University, St. Louis MO (jointly with Ann Podleski)
November 2011: "Genetics Models to Capstone an Introductory Statistics
Course," National Council of Teachers of Mathematics, Albuquerque, NM (jointly with Ann Podleski)
October 2011: "Hands-on Geometry and Topology", National Council of
Teachers of Mathematics, St. Louis, MO (jointly with Ann Podleski)
April 2010: "Every Equation Tells a Story", Faculty Research Seminar,
Harris-Stowe State University, St. Louis, MO
Also, oral and poster presentations at meetings of:
American Society of Human Genetics
International Genetic Epidemiology Society
Genetic Analysis Workshop
World Congress on Psychiatric Genetics
Certifications: Post-graduate work in Statistical Genetics in the Department of Psychiatry at Washington University, 1999-2002
EUSFLAT - LFA 2005
Multiobjective Formulations of Fuzzy Rule-Based Classification System Design
Hisao Ishibuchi and Yusuke Nojima
Graduate School of Engineering, Osaka Prefecture University, 1-1 Gakuen-cho, Sakai, Osaka 599-8531, Japan
{hisaoi, nojima}@cs.osakafu-u.ac.jp
Abstract
We examine several formulations of fuzzy rule selection for the design of fuzzy rule-based classification systems in our two-stage approach. The first stage is heuristic rule extraction where a large number of candidate rules are extracted. The second stage is evolutionary rule selection where fuzzy rule-based systems are constructed by choosing a small number of candidate rules. Rule selection is formulated as single-, two-, and three-objective optimization problems using an accuracy measure and two complexity measures.
Keywords: Fuzzy rule-based classification systems, Evolutionary multiobjective optimization, Accuracy-complexity tradeoff.
1 Introduction
Genetic algorithms have been used for the design of fuzzy rule-based systems in many studies [4]. While those studies mainly discussed accuracy maximization through genetic optimization, the existence of an accuracy-complexity tradeoff in the design of fuzzy rule-based systems has recently been recognized by some researchers [1], [2]. The accuracy-complexity tradeoff can be handled in two different approaches. One approach uses the weighted sum of an accuracy measure and a complexity measure. The other approach uses a multiobjective formulation where an accuracy measure and a complexity measure are optimized by multiobjective optimization techniques. In this case, multiple non-dominated (i.e., Pareto optimal) fuzzy rule-based systems are obtained. Each fuzzy rule-based system corresponds to a different tradeoff solution of the multiobjective optimization problem.
For the design of fuzzy rule-based classification systems, a weighted sum-based approach to fuzzy rule selection was proposed in Ishibuchi et al. [12], [13] where the number of correctly classified training patterns was maximized and the number of fuzzy rules was minimized. These two objectives were optimized by a two-objective genetic algorithm in [8]. The two-objective approach was further extended in [10] to the case of three objectives using an additional objective: to minimize the total number of antecedent conditions. A large number of non-dominated fuzzy rule-based classification systems were obtained from the three-objective formulation. The generalization ability of non-dominated fuzzy systems was examined in [14], [15]. A three-objective memetic algorithm was used to efficiently search for non-dominated fuzzy systems in [16].
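The three objectives described above can be sketched in code. The following is a minimal illustration, not the authors' implementation: the rule representation (a list of (antecedent, class, weight) triples, with None encoding "don't care"), the pluggable classifier, and all names are assumptions made for the example.

```python
# Illustrative evaluation of a candidate rule set against the three objectives
# of the formulation in [10]: maximize the number of correctly classified
# training patterns, minimize the number of rules, and minimize the total
# number of antecedent conditions (non-"don't care" antecedents).
def objectives(rule_set, patterns, labels, classify):
    """rule_set: list of (antecedent, consequent_class, rule_weight) triples,
    where an antecedent is a list with None encoding "don't care".
    classify(rule_set, pattern) is any single-winner classifier."""
    correct = sum(1 for x, y in zip(patterns, labels)
                  if classify(rule_set, x) == y)
    num_rules = len(rule_set)
    num_conditions = sum(sum(1 for a in antecedent if a is not None)
                         for antecedent, _cls, _cf in rule_set)
    # To be maximized, minimized, minimized, respectively:
    return correct, num_rules, num_conditions
```

The single-objective formulation would collapse this tuple into a weighted sum, while the two-objective formulation drops the condition count.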
In this paper, we examine several formulations of fuzzy rule selection for the design of fuzzy rule-based classification systems. We compare single-, two-, and three-objective formulations with each other through computational experiments on data sets from the UCI Machine Learning Repository. Experimental results demonstrate advantages and disadvantages of each formulation.
2 Fuzzy Rule-Based Classification Systems
In this section, we briefly explain fuzzy rule-based classification. For details, see the textbook on fuzzy data mining by Ishibuchi et al. [11].
Let us assume that we have m training patterns x_p = (x_p1, x_p2, ..., x_pn), p = 1, 2, ..., m, from M classes in the n-dimensional unit hyper-cube [0, 1]^n. That is, our pattern classification problem is an M-class problem with m training patterns in the n-dimensional pattern space [0, 1]^n.
We use fuzzy if-then rules of the following type:
Rule R_q: If x_1 is A_q1 and ... and x_n is A_qn then Class C_q with CF_q,   (1)
where R_q is the label of the q-th fuzzy rule, x = (x_1, ..., x_n) is an n-dimensional pattern vector, A_qi is an antecedent fuzzy set for the i-th attribute, C_q is a consequent class, and CF_q is a certainty grade (i.e., rule weight). We denote the fuzzy rule in (1) as "A_q ⇒ C_q". As the antecedent fuzzy set A_qi, we use one of the 14 fuzzy sets in Fig. 1. The antecedent fuzzy set A_qi can also be "don't care". Thus the total number of combinations of the n antecedent fuzzy sets in (1) is 15^n.
Figure 1: Antecedent fuzzy sets.
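In this line of work the 14 antecedent fuzzy sets of Fig. 1 are commonly realized as four homogeneous triangular partitions of the unit interval with granularities 2, 3, 4, and 5 (2 + 3 + 4 + 5 = 14), plus the "don't care" set whose membership is always 1. A minimal sketch under that assumption (function names are illustrative, not from the paper):

```python
def triangular_partition(k):
    """Return k triangular membership functions evenly spaced on [0, 1]."""
    funcs = []
    for i in range(k):
        center = i / (k - 1)          # peaks at 0, 1/(k-1), ..., 1
        width = 1.0 / (k - 1)
        funcs.append(lambda x, c=center, w=width: max(0.0, 1.0 - abs(x - c) / w))
    return funcs

# 14 antecedent fuzzy sets: partitions of granularity 2, 3, 4, and 5
antecedent_sets = [f for k in (2, 3, 4, 5) for f in triangular_partition(k)]

# the 15th choice per attribute is "don't care", with membership 1 everywhere
dont_care = lambda x: 1.0
```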
When the antecedent part of the fuzzy rule in (1) is given, the consequent class and the rule weight are determined in a heuristic manner from compatible training patterns. First we calculate the compatibility grade of each training pattern with the antecedent part using the product operation as
μ_Aq(x_p) = μ_Aq1(x_p1) × μ_Aq2(x_p2) × ... × μ_Aqn(x_pn),  (2)
where μ_Aqi(·) is the membership function of the antecedent fuzzy set Aqi. Then we calculate the confidence for each class as follows:
c(Aq ⇒ Class h) = Σ_{x_p ∈ Class h} μ_Aq(x_p) / Σ_{p=1}^{m} μ_Aq(x_p),  h = 1, 2, ..., M.  (3)

The consequent Cq of the fuzzy rule "Aq ⇒ Cq" in (1) is determined by finding the class with the maximum confidence for the antecedent Aq as

c(Aq ⇒ Cq) = max_{h=1,2,...,M} {c(Aq ⇒ Class h)}.  (4)
We define the rule weight by the difference between the confidence of the consequent class and the sum of the confidences of the other classes as

CFq = c(Aq ⇒ Cq) − Σ_{h=1, h≠Cq}^{M} c(Aq ⇒ Class h).  (5)
See [9], [11], [18] for other specifications of rule weights and their effects on the accuracy of fuzzy rule-based classification systems.
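The rule-generation procedure in (2)-(5), together with the single-winner classification described at the end of this section, can be sketched as follows. Membership functions are assumed to be plain callables; all function names are illustrative rather than the authors' implementation:

```python
import numpy as np

def product_compatibility(antecedent, x):
    # mu_Aq(x_p) = mu_Aq1(x_p1) * ... * mu_Aqn(x_pn), eq. (2)
    return np.prod([mu(xi) for mu, xi in zip(antecedent, x)])

def consequent_and_weight(antecedent, X, labels, num_classes):
    """Determine the consequent class (4) and rule weight (5) from data."""
    mu = np.array([product_compatibility(antecedent, x) for x in X])
    total = mu.sum()
    if total == 0:
        return None, 0.0                    # no compatible patterns: discard rule
    conf = np.array([mu[labels == h].sum() / total
                     for h in range(num_classes)])     # confidences, eq. (3)
    cq = int(conf.argmax())                            # consequent class, eq. (4)
    cf = conf[cq] - (conf.sum() - conf[cq])            # rule weight, eq. (5)
    return cq, cf

def classify(rules, x):
    """Single-winner classification: rules is a list of (antecedent, class, CF)."""
    best = max(rules, key=lambda r: product_compatibility(r[0], x) * r[2])
    return best[1]
```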
In this manner, the consequent class and the rule weight of each fuzzy rule can be easily determined from compatible training patterns. Let us denote the set of generated fuzzy rules by S. The rule set S can be viewed as a fuzzy rule-based classification system. The classification of an input pattern x_p = (x_p1, x_p2, ..., x_pn) is performed in the fuzzy rule-based classification system S by choosing a single winner rule Rw from S as follows:
μ_Aw(x_p) · CFw = max { μ_Aq(x_p) · CFq | Rq ∈ S }.  (6)
The input pattern x_p is classified as Class Cw, which is the consequent class of the winner rule Rw.

3 Two-Stage Fuzzy Rule Selection Approach

Fuzzy rule selection in this paper is to find non-dominated rule sets from the 15^n fuzzy rules of the form in (1) with respect to accuracy and complexity. Since any subset of the 15^n fuzzy rules can be represented by a binary string of length 15^n, the size of the search space is 2^(15^n). Except for the case of low-dimensional problems, it is very difficult to handle such a huge search space. Thus a two-stage fuzzy rule selection approach has been proposed in [11], [16]. The first stage is heuristic rule extraction, where a tractable number of promising candidate rules are extracted from numerical data using a heuristic rule evaluation measure, in the same manner as in data mining. The second stage is evolutionary rule selection, where evolutionary optimization algorithms are used to find non-dominated subsets of the extracted candidate rules with respect to accuracy and complexity.
A number of heuristic rule evaluation measures were examined in [17]. In this paper, we use the following rule evaluation measure:

f(Rq) = s(Aq ⇒ Cq) − Σ_{h=1, h≠Cq}^{M} s(Aq ⇒ Class h),  (7)

where s(·) is the support of a fuzzy rule, which is defined as

s(Aq ⇒ Class h) = (1/m) Σ_{x_p ∈ Class h} μ_Aq(x_p).  (8)

The heuristic rule evaluation measure in (7) is a modified version of a fitness function used in an iterative fuzzy GBML (genetics-based machine learning) algorithm called SLAVE [7].

Using the rule evaluation measure in (7), we generate a prespecified number of promising candidate rules for each class. In this heuristic rule extraction stage, we only examine short fuzzy rules with a few antecedent conditions. This is because we want to construct interpretable fuzzy rule-based classification systems (i.e., because it is very difficult for human users to intuitively understand long fuzzy rules with many antecedent conditions). More specifically, we choose 300 rules with the largest values of the rule evaluation measure in (7) for each class among short rules of length three or less in our computational experiments, except for the sonar data set with 60 attributes. For the sonar data set, we only examine short rules of length two or less. The total number of candidate rules is 300·M, where M is the number of classes.

Let N be the total number of extracted candidate fuzzy rules (i.e., N = 300·M in our computational experiments). Any subset S of the candidate fuzzy rules can be represented by a binary string of length N as S = s1 s2 ... sN, where sj = 1 and sj = 0 mean that the j-th candidate rule is included in S and excluded from S, respectively. Such a binary coding is used in the second stage of our two-stage fuzzy rule selection approach.

In the second stage (i.e., the evolutionary rule selection stage), evolutionary multiobjective optimization (EMO) algorithms are used to search for non-dominated rule sets (i.e., non-dominated binary strings of length N) with respect to accuracy and complexity. Formulations of fuzzy rule selection as multiobjective optimization problems are discussed in the next section. For EMO algorithms, see [3], [5]. We use the NSGA-II algorithm of Deb et al. [6].
4 Formulations of Rule Selection
We use an accuracy measure f1(S), which is the number of training patterns correctly classified by S, as in our former studies [8], [10]-[16]. We also use two complexity measures f2(S) and f3(S): f2(S) is the number of fuzzy rules in S, and f3(S) is the total number of antecedent conditions (i.e., f3(S) is the total rule length of the fuzzy rules in S).
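Given a binary-coded rule set, the three measures can be computed as in the following sketch. The rule representation is hypothetical: rules[j] bundles the antecedent membership functions, consequent class, rule weight, and rule length of the j-th candidate rule.

```python
import numpy as np

def objectives(s, rules, X, labels):
    """Compute (f1, f2, f3) for a binary-coded rule set s.
    rules[j] = (antecedent_memberships, consequent_class, cf, rule_length)."""
    active = [r for r, bit in zip(rules, s) if bit]
    f2 = len(active)                           # number of fuzzy rules in S
    f3 = sum(r[3] for r in active)             # total rule length of S
    f1 = 0
    for x, y in zip(X, labels):
        # single-winner scheme: largest compatibility * rule weight
        scores = [np.prod([mu(xi) for mu, xi in zip(r[0], x)]) * r[2]
                  for r in active]
        if scores and max(scores) > 0 and active[int(np.argmax(scores))][1] == y:
            f1 += 1                            # correctly classified pattern
    return f1, f2, f3
```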
Several formulations are possible using these three measures in fuzzy rule selection. When we use all three measures, we have the following three-objective optimization problem:
Maximize f1(S) and minimize f2(S), f3(S).  (9)

We can formulate two two-objective optimization problems using one of the two complexity measures:

Maximize f1(S) and minimize f2(S),  (10)
Maximize f1(S) and minimize f3(S).  (11)
From each multiobjective optimization problem, we can formulate a single-objective maximization problem using a weighted sum fitness function:

fitness(S) = w1·f1(S) − w2·f2(S) − w3·f3(S),  (12)
fitness(S) = w1·f1(S) − w2·f2(S),  (13)
fitness(S) = w1·f1(S) − w3·f3(S).  (14)
5 Comparison among Six Formulations

Through computational experiments on the six data sets in Table 1, we compared the six formulations of fuzzy rule selection described in the previous section.
For the multiobjective optimization problems in (9)-(11), the NSGA-II algorithm [6] was executed using the following parameter specifications:
Population size: 200 strings,
Crossover probability: 0.8 (uniform crossover),
Biased mutation probabilities:
pm(0 → 1) = 1/(300·M) and pm(1 → 0) = 0.1,
Stopping condition: 5000 generations.
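The biased mutation above, which flips 1s to 0s far more often than 0s to 1s and thereby pushes the search toward small rule sets, can be sketched as follows (the function name is illustrative):

```python
import random

def biased_mutation(s, num_classes, p01=None, p10=0.1):
    """Flip the bits of a rule-set string s with direction-dependent
    probabilities: p(0 -> 1) = 1/(300*M) and p(1 -> 0) = 0.1, biasing the
    search toward rule sets with few rules."""
    if p01 is None:
        p01 = 1.0 / (300 * num_classes)
    return [1 - bit if random.random() < (p10 if bit else p01) else bit
            for bit in s]
```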
We used a standard genetic algorithm with the same generation update scheme as the NSGA-II for the single-objective maximization problems in (12)-(14). The same parameter values were used in the NSGA-II algorithm and the standard genetic algorithm.
We mainly report experimental results on training patterns, where we used all patterns in each data set as training patterns. Each formulation was examined by 20 independent runs on each data set. In Table 2, we show the average number of obtained non-dominated rule sets from each of the three multiobjective formulations. From this table, we can see that more non-dominated rule sets were obtained when f3(S) was used as a complexity measure. It should be noted that only a single rule set was always obtained from each of the single-objective formulations.
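Extracting the non-dominated rule sets from a population with respect to (maximize f1, minimize f2, minimize f3) amounts to a standard Pareto filter; a minimal sketch:

```python
def dominates(a, b):
    """a, b are (f1, f2, f3) tuples; f1 is maximized, f2 and f3 minimized."""
    no_worse = a[0] >= b[0] and a[1] <= b[1] and a[2] <= b[2]
    strictly_better = a[0] > b[0] or a[1] < b[1] or a[2] < b[2]
    return no_worse and strictly_better

def non_dominated(points):
    """Keep only the points not dominated by any other point."""
    return [p for p in points if not any(dominates(q, p) for q in points)]
```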
Table 1: Data sets in computational experiments.

Data set    Attributes  Patterns  Classes
Breast W    9           683*      2
Diabetes    8           768       2
Glass       9           214       6
Heart C     13          297*      5
Sonar       60          208       2
Wine        13          178       3

*Incomplete patterns with missing values are not included.

Table 2: Average number of non-dominated rule sets.

Data set    {f1, f2}  {f1, f3}  {f1, f2, f3}
Breast W    9.50      11.30     12.55
Diabetes    8.25      11.85     15.55
Glass       16.95     27.30     31.05
Heart C     28.60     48.65     49.60
Sonar       8.75      11.65     12.80
Wine        5.90      9.55      12.15
As shown in Table 2, a number of non-dominated rule sets were obtained by a single run of the NSGA-II algorithm. Examples of rule sets obtained by a single run for the Glass data set are shown in Fig. 2 and Fig. 3. Fig. 2 shows experimental results using the three-objective formulation and the two-objective formulation with f1(S) and f2(S), while Fig. 3 shows results from the three-objective formulation and the two-objective formulation with f1(S) and f3(S).
Figure 2: Non-dominated rule sets obtained by a single run for the three-objective formulation and the two-objective formulation with f1(S) and f2(S); the horizontal axis is the number of fuzzy rules. For comparison, the average result by the single-objective formulation in (12) is also shown.

Figure 3: Non-dominated rule sets obtained by a single run for the three-objective formulation and the two-objective formulation with f1(S) and f3(S); the horizontal axis is the total rule length. For comparison, the average result by the single-objective formulation in (12) is also shown.
In Fig. 2 and Fig. 3, the same rule sets from the three-objective formulation are shown by small closed circles with different horizontal axes. From Fig. 2 and Fig. 3 (and from Table 2), we can see that more non-dominated rule sets are obtained from the three-objective formulation than from the two-objective ones.
To examine the search ability of the NSGA-II algorithm, we examined the best rule set with respect to accuracy among the non-dominated rule sets obtained from each run. The average error rate of the best rule set over 20 runs is shown in Table 3. For comparison, we also show the average error rate over 20 runs for each of the three single-objective formulations in Table 4 and Table 5; different weight values were used in each table. The average result in Table 4 for the Glass data set using the single-objective formulation in (12) is also shown in Fig. 2 and Fig. 3. From Tables 3-5 (and also from Fig. 2 and Fig. 3), we can see that the multiobjective formulations are inferior to the single-objective formulations in terms of accuracy for some data sets. This observation suggests the necessity of further improvement of the search ability of the NSGA-II algorithm.
Table 3: Average value of the best error rate (%) among obtained non-dominated rule sets in each run.

Data set    {f1, f2}  {f1, f3}  {f1, f2, f3}
Breast W    1.76      1.83      1.78
Diabetes    22.15     22.08     22.23
Glass       21.36     21.40     21.47
Heart C     28.40     28.40     28.38
Sonar       10.63     10.53     10.72
Wine        0.00      0.00      0.00

Table 4: Average error rate (%) by the three single-objective formulations (w1 = 100, w2 = w3 = 1).

Data set    {f1, f2}  {f1, f3}  {f1, f2, f3}
Breast W    1.79      1.81      1.80
Diabetes    21.96     22.01     22.01
Glass       20.44     20.51     20.61
Heart C     28.45     28.60     28.52
Sonar       9.59      9.95      9.69
Wine        0.00      0.00      0.00

Table 5: Average error rate (%) by the three single-objective formulations (w1 = 10, w2 = w3 = 1).

Data set    {f1, f2}  {f1, f3}  {f1, f2, f3}
Breast W    1.77      1.79      1.79
Diabetes    22.08     22.11     22.02
Glass       20.37     20.72     20.47
Heart C     28.64     28.52     28.52
Sonar       9.76      9.71      9.71
Wine        0.00      0.00      0.00
While we cannot observe any advantage of the three-objective formulation over the two-objective ones in the experimental results on training data (except for the increase in the number of obtained non-dominated rule sets), the use of f3(S) together with f1(S) and f2(S) has a positive effect on the generalization ability to test patterns for some data sets. In Fig. 4, we compare the error rates of non-dominated rule sets obtained from the three-objective formulation and the two-objective one with f1(S) and f2(S). Fig. 4 shows experimental results of a single run of each formulation for the diabetes data set with 50% training patterns and 50% test patterns. For comparison, Fig. 4 also shows the corresponding result by the single-objective formulation in (12).
Figure 4: Generalization ability of non-dominated rule sets obtained by a single run for the diabetes data set with 50% training and 50% test patterns; the horizontal axis is the number of fuzzy rules. The three-objective formulation is compared with the single-objective formulation in (12) and the two-objective formulation with f1(S) and f2(S).
6 Conclusions
In this paper, we compared six formulations of fuzzy rule selection. The main advantage of the multiobjective formulations over the single-objective ones is that multiple non-dominated rule sets, which visually show the accuracy-complexity tradeoff, are obtained from a single run. Experimental results suggested that increasing the number of objectives may improve the generalization ability of fuzzy rule-based classification systems, while it makes the search for non-dominated rule sets more difficult.
The authors gratefully acknowledge the financial support of the Okawa Foundation for Information and Telecommunications.
References
[1] J. Casillas, O. Cordon, F. Herrera, and L. Magdalena (eds.), Interpretability Issues in Fuzzy Modeling, Springer, 2003.
[2] J. Casillas, O. Cordon, F. Herrera, and L. Magdalena (eds.), Accuracy Improvements in Linguistic Fuzzy Modeling, Springer, 2003.
[3] C. A. Coello Coello, D. A. van Veldhuizen, and G. B. Lamont, Evolutionary Algorithms for Solving Multi-Objective Problems, Kluwer Academic Publishers, 2002.
[4] O. Cordon, F. Herrera, F. Hoffman, and L. Magdalena, Genetic Fuzzy Systems, World Scientific, 2001.
[5] K. Deb, Multi-Objective Optimization Using Evolutionary Algorithms, John Wiley & Sons, 2001.
[6] K. Deb, A. Pratap, S. Agarwal, and T. Meyarivan, “A fast and elitist multiobjective genetic algorithm: NSGA-II,” IEEE Trans. on Evolutionary Computation, vol. 6, no. 2, pp. 182-197, 2002.
[7] A. Gonzalez and R. Perez, “SLAVE: A genetic learning system based on an iterative approach,” IEEE Trans. on Fuzzy Systems, vol. 7, no. 2, pp. 176-191, 1999.
[8] H. Ishibuchi, T. Murata, and I. B. Turksen, “Single-objective and two-objective genetic algorithms for selecting linguistic rules for pattern classification problems,” Fuzzy Sets and Systems, vol. 89, no. 2, pp. 135-150, 1997.
[9] H. Ishibuchi and T. Nakashima, “Effect of rule weights in fuzzy rule-based classification systems,” IEEE Trans. on Fuzzy Systems, vol. 9, no. 4, pp. 506-515, 2001.
[10] H. Ishibuchi, T. Nakashima, and T. Murata, “Three-objective genetics-based machine learning for linguistic rule extraction,” Information Sciences, vol. 136, no. 1-4, pp. 109-133, 2001.
[11] H. Ishibuchi, T. Nakashima, and M. Nii, Classification and Modeling with Linguistic Information Granules, Springer, 2004.
[12] H. Ishibuchi, K. Nozaki, N. Yamamoto, and H. Tanaka, “Construction of fuzzy classification systems with rectangular fuzzy rules using genetic algorithms,” Fuzzy Sets and Systems, vol. 65, no. 2/3, pp. 237-253, 1994.
[13] H. Ishibuchi, K. Nozaki, N. Yamamoto, and H. Tanaka, “Selecting fuzzy if-then rules for classification problems using genetic algorithms,” IEEE Trans. on Fuzzy Systems, vol. 3, no. 3, pp. 260-270, 1995.
[14] H. Ishibuchi and T. Yamamoto, “Effects of three-objective genetic rule selection on the generalization ability of fuzzy rule-based systems,” Lecture Notes in Computer Science 2632: EMO 2003, pp. 608-622, Springer, 2003.
[15] H. Ishibuchi and T. Yamamoto, “Evolutionary multiobjective optimization for generating an ensemble of fuzzy rule-based classifiers,” Lecture Notes in Computer Science 2723: GECCO 2003, pp. 1077-1088, Springer, 2003.
[16] H. Ishibuchi and T. Yamamoto, “Fuzzy rule selection by multi-objective genetic local search algorithms and rule evaluation measures in data mining,” Fuzzy Sets and Systems, vol. 141, no. 1, pp. 59-88, 2004.
[17] H. Ishibuchi and T. Yamamoto, “Comparison of heuristic criteria for fuzzy rule selection in classification problems,” Fuzzy Optimization and Decision Making, vol. 3, no. 2, pp. 119-139, 2004.
[18] H. Ishibuchi and T. Yamamoto, “Rule weight specification in fuzzy rule-based classification systems,” IEEE Trans. on Fuzzy Systems (in press).
Recurrent Neural Networks for Modeling Motion Capture Data
Mir Khan, Heikki Huttunen, Olli Suominen and Atanas Gotchev
Laboratory of Signal Processing
Tampere University of Technology
Tampere, Finland
Email: mirabdul@tut.fi, heikki.huttunen@tut.fi, olli.j.suominen@tut.fi, atanas.gotchev@tut.fi
Abstract—Recurrent Neural Networks have recently received attention for human animation applications including motion synthesis; however, previous works did not provide quantitative approaches for evaluating the quality of the motion generated by these models. In this paper, we use three different recurrent neural network architectures for synthesizing human motion for a detailed skeleton with 64 joints. We introduce a novel motion quality metric for quantitatively evaluating the realism of the synthesized motion. We use this metric, among others, to compare the motion generated by the three network architectures and empirically study the impact of the network’s complexity on the quality of the motion.
Keywords: motion capture, recurrent neural network, generative model, long short-term memory
1. Introduction
Producing natural animations is important for many entertainment industries. Motion capture is one commonly used method for reproducing convincing animations by recording motion played by human actors. Motion capture data is used to animate 3D characters by mapping the same motion onto a virtual character. While motion capture offers many advantages, it suffers from several drawbacks, including the difficulty of reusing motion data for different scenarios. Therefore, it would be of great interest to the industry to generate animations through alternative approaches that do not rely heavily on human actors and manual processing.
One alternative to motion capture and manual animation is simulation-based methods for generating natural motion, an area closely related to robotics. The principle in these methods is to develop control strategies for humanoids and imaginary creatures in a simulated environment through the use of reinforcement learning and optimization methods such as genetic algorithms. The control strategies are then optimized with respect to a criterion (often called a fitness function) such as total distance traveled. An early example of these methods was introduced in [24], where virtual creatures with random morphologies develop relatively optimal control strategies such as swimming, running, jumping, and crawling depending on their environment, their own physical characteristics, and the defined fitness function. Similar techniques have been used to produce more controlled behavior such as bipedal gait animations in a simulated environment [22]. Using biomechanical constraints has been shown to produce even more convincing and natural-looking animations for bipedal virtual creatures [10]. The main drawback of these techniques is that the generated motion is often not exactly what the animator desires.
Techniques based on neural networks with multiple layers are known as deep learning. Neural networks, which are crude models of the information-processing mechanism of the biological brain, are extremely versatile machine learning tools with an ever-growing range of applications. Neural network techniques have previously been applied successfully to tasks such as image classification [17][1], face recognition [18][21], audio classification [16], and speech recognition [9].
Deep learning techniques for synthesizing 3D human motion have drawn attention recently. An approach was proposed in [14][15] to learn a space of valid human poses (called the motion manifold) using a convolutional-autoencoder network architecture. The framework can be used for novel motion synthesis, motion interpolation, and error correction. Conditional Restricted Boltzmann Machines (cRBMs) have been used to synthesize human gait animations [25].
Recurrent Neural Networks (RNNs) are a variation of standard neural networks with feedback loops, where the inputs of previous time steps influence the output at present and future time steps. This temporal attribute of RNNs makes them well-suited for analyzing time-series data. RNNs have been successfully used for automatic music composition and analysis [3][20], image synthesis [11], and handwriting generation [8].
A variant of RNNs, called Long Short-Term Memory (LSTM), has been proposed to solve several problems that standard RNNs suffer from; among these is learning long-term dependencies, which LSTM networks are capable of for up to 1000 discrete time steps [12]. More notably, LSTM networks mitigate the vanishing gradient phenomenon [13], a problem that arises in very deep neural networks, and in RNNs in particular. The main difference between a standard RNN and an LSTM network is the inclusion of memory states and gates. LSTM networks have
been used successfully for human motion synthesis by training the network on a large dataset of human motion capture data with a variety of motion categories [4]. A similar approach was presented in [2], using a data set consisting of recordings of a dancer, which allows the network to generate dancing animations in a similar style.
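The memory state and gates mentioned above can be illustrated with a minimal single-step LSTM cell in NumPy. This is the standard formulation, not necessarily the exact variant used by the networks in this paper:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h, c, W, U, b):
    """One LSTM time step. W: (4H, D), U: (4H, H), b: (4H,), stacked for the
    input, forget, and output gates and the candidate cell state."""
    H = h.shape[0]
    z = W @ x + U @ h + b
    i = sigmoid(z[0:H])            # input gate
    f = sigmoid(z[H:2 * H])        # forget gate
    o = sigmoid(z[2 * H:3 * H])    # output gate
    g = np.tanh(z[3 * H:4 * H])    # candidate cell state
    c_new = f * c + i * g          # memory state carries long-term information
    h_new = o * np.tanh(c_new)     # hidden state / output
    return h_new, c_new
```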
Our work is most closely related to [2] and [4], but our focus is on motion synthesis for a detailed skeleton with 64 joints. More importantly, we study and compare the quality of the synthesized motion and present a quantitative evaluation of the motion generated by three different LSTM architectures. The purpose of this study is to determine the impact of the complexity of the model in terms of layers on the quality of the generated motion, and therefore provide a rigorous and quantitative justification for selecting a model for applications of motion synthesis.
2. Methods
2.1. Data preparation
Our data set consists of 5 hours of motion capture data recorded at 120 frames per second. Various motion categories are present in the data set, such as walking, running, and dancing. The skeleton model of the motion capture data set is a 64-joint human skeleton model, which includes details such as hand fingers. The motion capture data is stored as a time series of joint rotation angles in the files, but we transform this data into a time series of joint position coordinates. Thus, each time step of the sequence is represented by a 3 × 64 = 192-dimensional vector consisting of the concatenation of the x, y, and z position coordinates of each joint. We construct the training set by extracting fixed-length sequences using a 200-frame temporal window sliding at a temporal step size of 100 frames, resulting in 50% overlap between consecutive training samples. Each input sample X ∈ R^(192×200) is a motion sequence given as
        | x_hips^(1)   x_hips^(2)   x_hips^(3)   ...  x_hips^(200)  |
        | y_hips^(1)   y_hips^(2)   y_hips^(3)   ...  y_hips^(200)  |
        | z_hips^(1)   z_hips^(2)   z_hips^(3)   ...  z_hips^(200)  |
    X = | x_spine^(1)  x_spine^(2)  x_spine^(3)  ...  x_spine^(200) |   (1)
        |    ...          ...          ...       ...      ...       |
        | x_foot^(1)   x_foot^(2)   x_foot^(3)   ...  x_foot^(200)  |
        | y_foot^(1)   y_foot^(2)   y_foot^(3)   ...  y_foot^(200)  |
        | z_foot^(1)   z_foot^(2)   z_foot^(3)   ...  z_foot^(200)  |
where the horizontal dimension corresponds to the number of frames in the sequence, and the vertical dimension corresponds to the degrees of freedom of all 64 joints, i.e., the joint position coordinates. The subscript denotes the joint name according to the hierarchy defined in the motion data files; in our case, the first joint from which all other joints extend is the hips joint. The superscript denotes the frame number and corresponds to the horizontal axis of the matrix.
Each target output sample y ∈ R^(192×1) in the training set is a vector representing a single frame of motion; it is the frame that follows the input sample X extracted from the same original motion file. Thus, a single training sample is given as the pair (X, y). In essence, the network is trained to complete the motion sequence it is given, one frame at a time. These data preparation methods result in a final data set of nearly 15,000 samples, of which 500 are reserved for validation and another 500 for testing and motion synthesis.
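The sliding-window construction described above (200-frame windows, 100-frame stride, next frame as target) can be sketched as follows; the function name is illustrative:

```python
import numpy as np

def make_training_pairs(motion, window=200, step=100):
    """Slice a (192, T) joint-position sequence into (X, y) training pairs:
    X is a 200-frame window and y is the frame immediately following it."""
    pairs = []
    for start in range(0, motion.shape[1] - window, step):
        X = motion[:, start:start + window]   # (192, 200) input window
        y = motion[:, start + window]         # next frame, (192,) target
        pairs.append((X, y))
    return pairs
```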
To generate a motion sequence, the input X is first fed to the network and the output is computed. This output is then appended to the previous input sequence X along the temporal axis, and the sequence is shifted one frame forward, so that X contains the previous output y as its last frame while maintaining its length of 200 frames. By repeatedly feeding the newly constructed input sequence X to the network, this process can generate motion sequences of arbitrary length.
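This autoregressive generation loop can be sketched as follows, with `predict` standing in for the trained network:

```python
import numpy as np

def synthesize(predict, X, num_frames):
    """Autoregressive synthesis: repeatedly predict the next frame and slide
    the 200-frame input window forward by one frame. `predict` maps a
    (192, 200) window to the next (192,) frame."""
    frames = []
    window = X.copy()
    for _ in range(num_frames):
        y = predict(window)
        frames.append(y)
        # drop the oldest frame, append the newly generated one
        window = np.concatenate([window[:, 1:], y[:, None]], axis=1)
    return np.stack(frames, axis=1)           # (192, num_frames)
```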
2.2. Network architecture
We use three network architectures, which we will refer to as LSTM1, LSTM2, and LSTM3, where the suffix denotes the number of layers in the network, each with 1000 LSTM units. Previous approaches applying LSTMs to motion capture used slightly more complex network architectures, such as an encoder-decoder architecture in [4] and a Mixture-Density Network layer in [2]. For the purpose of analysis and comparison, we decided to keep the models simple so that our analysis captures the essential properties of these networks.
The weights for all layers are initialized according to [23] by generating an orthogonal matrix with a gain factor of 1.0. The weight matrix for the recurrent kernel is initialized by sampling a truncated normal distribution; this is known as Glorot normal initialization [6]. The bias values are all initialized to zero. A linear activation function is used at the output layer of the network. All hidden layers use the hyperbolic tangent as their activation function, and a hard sigmoid as the activation function for the recurrent step. Mean Squared Error (MSE) was used as the loss function and RMSprop [26] was chosen as the optimizer with an initial learning rate of 0.001. The network was trained with approximately 14,000 samples using a mini-batch size of 32. Each network was trained until its performance (in the MSE sense) stopped improving.
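Orthogonal initialization with a gain factor, as used here for the layer weights, is typically implemented via the QR decomposition of a random Gaussian matrix; a sketch for the case rows >= cols (not necessarily the exact routine used by the authors' framework):

```python
import numpy as np

def orthogonal_init(rows, cols, gain=1.0, seed=None):
    """Orthogonal initialization (rows >= cols): QR-decompose a random
    Gaussian matrix and scale the orthonormal factor by `gain`
    (the paper uses gain = 1.0)."""
    rng = np.random.default_rng(seed)
    a = rng.standard_normal((rows, cols))
    q, r = np.linalg.qr(a)          # q: (rows, cols) with orthonormal columns
    q *= np.sign(np.diag(r))        # fix column signs so the result is unique
    return gain * q
```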
3. Results
All three trained networks can generate novel motion sequences that complete the given input sequence, while
maintaining inter-joint relationships to varying degrees. Examples of the generated motions are included in the supplementary materials. From visual observation alone, the LSTM network with 3 layers maintains inter-joint relationships for the longest number of time steps and shows better motion variety. However, in order to make this analysis more rigorous, we perform a quantitative evaluation of the quality of the motion.
Quantifying the quality of the generated motions is difficult due to the strongly qualitative nature of human motion. One way to measure the correctness of motion is to evaluate the network’s understanding of inter-joint relationships. This is done by computing the distance of each joint from its parent and taking the difference from the original length of this link. We take the absolute value of this difference and average it over all the joints in the skeleton. We will denote this metric by the name Inter-Joint Variation (IJV). For a single frame, the IJV averaged over all joints is given by the expression
IJV = (1/K) Σ_{i=1}^{K} | ‖s_i − s_{p(i)}‖_2 − ‖v_i − v_{p(i)}‖_2 |,  (2)
where the vectors s_i, s_{p(i)} ∈ R^3 are the position coordinates of joint i and of joint i’s parent, respectively. The total number of joints is denoted by K (64 in our case). The function p(i) is a discrete-valued function that returns the index of the parent of joint i. Similarly, v_i and v_{p(i)} are position coordinates for the same joints and the same skeleton (typically from the original recorded motion file), which serve as the ground truth. It should be noted that the actual values of v_i and v_{p(i)} do not matter; what is important for this measurement is the length of the link connecting these joints. In our case, as ground truth, we simply use the inter-joint distances in the first frame of the sequence of interest. Figure 1 provides a visualization of this measurement for each model at each frame, averaged over all samples. It can be seen that the IJV values for LSTM3 grow slowest in comparison with the other two models.
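A sketch of the per-frame IJV computation in (2), assuming a (K, 3) array of joint positions and a parent-index array (the root may point to itself, contributing zero):

```python
import numpy as np

def ijv(s, v, parent):
    """Average Inter-Joint Variation, eq. (2): s and v are (K, 3) arrays of
    generated and ground-truth joint positions; parent[i] is the index of
    joint i's parent."""
    gen_len = np.linalg.norm(s - s[parent], axis=1)   # generated link lengths
    ref_len = np.linalg.norm(v - v[parent], axis=1)   # ground-truth link lengths
    return np.abs(gen_len - ref_len).mean()
```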
Measuring the joint relationships on their own may not always be an accurate quantification of the quality of motion, since it is possible that the network outputs sequences with little to no movement and yet small IJV values. Therefore, we use an additional metric that measures motion energy, which for a sample F ∈ R^(m×n) can be computed as

E(F) = Σ_{i=1}^{m} Σ_{j=2}^{n} (F_ij − F_i,j−1)^2.  (3)

Here, F_ij is the element at row i and column j, m is the number of degrees of freedom of all joints (192 in our case), and n is the number of frames in the sequence.

Figure 1: IJV measurements for each model, shown at each frame, averaged over all samples.

TABLE 1. Analysis results of the motion sequences generated by the three models, averaged over all 500 samples and all 400 frames per sample.

Metric        LSTM1     LSTM2      LSTM3     Ground Truth
IJV_avg       1.150 cm  1.472 cm   0.939 cm  0.0 cm
MID_avg       6.675 cm  13.643 cm  6.975 cm  5.608 cm
Energy_avg    1.128     6.306      1.028     0.409
Energy_std    1.247     15.156     1.387     1.319

Consider the two graphs shown in Figure 3, which illustrate the performance of each network (in the IJV sense) as we restrict the samples to a subset with mean energy exceeding the threshold on the horizontal axis (top). The graph at the bottom shows the distribution of these energies, visualized in the same manner by representing each point in the graph as the measurement on the restricted subset of samples exceeding a mean energy threshold. The purpose of this analysis is to study how the network’s understanding of joint relationships changes in relation to the energy level. Additionally, it illustrates the proportion of the samples for which IJV measurements are made in the top graph. One can also think of the energy measurement as a measure of the average amount of motion in a sequence, such that motion sequences with faster movements will have more energy than motions with slower movements.
A straightforward metric is the Euclidean distance between consecutive frames, which is a reasonable way to quantify the motion similarity between consecutive frames. We compute this result using the equation

MID = ‖f_j − f_{j−1}‖_2,  (4)

where f_j ∈ R^(192×1) denotes the j-th frame in the sequence. This result can then be averaged over all samples. Table 1 shows these measurements for all samples for each model, in order to provide a general comparison of the quality of the motion generated by each network. We convert the IJV results to the physical unit of centimeters in order to provide an intuitive sense of the errors. IJV and MID are calculated as shown before and averaged over all samples. On the third row, Energy_avg shows the average energy as calculated by equation (3). The last row, Energy_std, shows the standard deviation of the energy over all samples. The last column (Ground Truth) shows these measurements for the 500 samples from the data set reserved for motion synthesis. It can be argued that LSTM3 shows the best capacity for novel and realistic motion synthesis while maintaining inter-joint relationships.
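Sketches of the energy measure in (3) and the mean inter-frame distance in (4) for a sequence F with one column per frame (no additional normalization is assumed; function names are illustrative):

```python
import numpy as np

def energy(F):
    """Motion energy of an (m, n) sequence, eq. (3): sum of squared
    frame-to-frame differences over all degrees of freedom."""
    return np.sum((F[:, 1:] - F[:, :-1]) ** 2)

def mean_interframe_distance(F):
    """Average Euclidean distance between consecutive frames, eq. (4)."""
    return np.mean(np.linalg.norm(F[:, 1:] - F[:, :-1], axis=0))
```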
Figure 2 shows the IJV values for each joint, averaged across all models and all samples. This illustration aims to highlight the joints that seem to be most problematic for our models to learn. One possible explanation for the severity of the errors at the fingers is that finger motions can be very complex, while leg joints, for example, remain mostly similar over the data set.
Figure 2. IJV values averaged across all models for each joint.
Figure 3. IJV at each frame averaged over all samples for each model (top) and its relationship with the energy measurements shown in the bottom graph.
4. Conclusion
We have studied three different LSTM models for human motion synthesis for a detailed skeleton with 64 joints. Anal¬ysis shows that the three-layered LSTM architecture, with 1000 nodes in each layer, produces motions that are most realistic when compared to the single-layer and the 2-layer LSTM networks. The LSTM3 network can maintain good inter-joint relationships which extends up to 400 frames. Analysis has also shown that, overall, these models can reasonably accurately maintain joint relations for the spine, neck and legs, but they faces some difficulty in maintaining these relations for the more detailed segments of the skeleton such as hand fingers.
We can also observe that the two-layered LSTM architecture is a notably poor model for human motion synthesis, even more so than the single-layered model. This may be because the two-layer model is not robust against over-learning (unlike LSTM1), while it still lacks the expressive power of LSTM3. However, more research is needed to confirm this observation and to consider models even deeper than the three-layer one.
In the future, we aim to make use of a larger data set and of specific motion types. Such a data set may allow more control over the generated motion categories. More specifically, we wish to build on the work in [2], which uses a data set of recordings of a dancer, and extend it to other
motion categories such as walking, sprinting, crawling, or even different dancing styles.
Additionally, we wish to examine how larger models, in terms of both the number of layers and the number of neurons in each layer, affect the maintenance of joint relations. We expect that deeper and more complex models will show better capacity for maintaining detailed joint relations, such as those of the fingers and feet.
Finally, we plan to investigate the impact of semantically-rooted regularization techniques. The intuitive motivation is to allow incorrect outputs of the network (in the MSE sense) that honor inter-joint relations to have less of an impact on the direction of the gradient during training. In theory, however, this could result in a worst-case scenario where the network always outputs the same pose; further studies are needed.
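A minimal sketch of what such a semantically-rooted regularizer could look like, assuming a bone-length consistency penalty added to the MSE term. The weight lambda_, the penalty form, and all names are illustrative assumptions, not the formulation the paper proposes:

```python
# Sketch of an MSE loss augmented with an inter-joint (bone-length)
# consistency term. A predicted pose that is wrong in the MSE sense but
# preserves bone lengths is penalized less. lambda_ and the penalty
# form are illustrative assumptions, not the paper's formulation.
import math

def mse(pred, target):
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)

def bone_length(pose, j1, j2):
    """Euclidean distance between two joints; pose is [(x, y), ...]."""
    (x1, y1), (x2, y2) = pose[j1], pose[j2]
    return math.hypot(x2 - x1, y2 - y1)

def regularized_loss(pred_pose, target_pose, bones, lambda_=0.1):
    flat_p = [c for joint in pred_pose for c in joint]
    flat_t = [c for joint in target_pose for c in joint]
    data_term = mse(flat_p, flat_t)
    reg_term = sum(
        (bone_length(pred_pose, a, b) - bone_length(target_pose, a, b)) ** 2
        for a, b in bones
    ) / len(bones)
    return data_term + lambda_ * reg_term

# Two-joint toy skeleton: the prediction is translated but the bone
# length is preserved, so the regularization term contributes nothing.
target = [(0.0, 0.0), (1.0, 0.0)]
pred = [(0.5, 0.0), (1.5, 0.0)]
print(regularized_loss(pred, target, bones=[(0, 1)]))  # 0.125
```

A prediction that stretches the bone would incur the extra penalty, so gradients would push harder against poses that break inter-joint relations than against poses that are merely displaced.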
5. Supplementary Material
Video links showing the motion sequences generated by each model. Sequences with the blue skeleton are the input sequences fed to the network; motion sequences in green are generated by the network.
• LSTM1
• LSTM2
• LSTM3
Acknowledgments
This work was supported by the research infrastructure of the Center for Immersive Visual Technologies (CIVIT), Tampere University of Technology. The training and testing data were collected and provided by Keho Interactive Oy.
References
[1] Ciregan D, Meier U, Schmidhuber J. Multi-column deep neural networks for image classification. In Computer Vision and Pattern Recognition (CVPR), 2012 IEEE Conference on 2012.
[2] Crnkovic-Friis L, Crnkovic-Friis L. Generative Choreography using Deep Learning. arXiv preprint arXiv:1605.06921. 2016.
[3] Eck D, Schmidhuber J. A first look at music composition using lstm recurrent neural networks. Istituto Dalle Molle Di Studi Sull Intelligenza Artificiale. 2002.
[4] Fragkiadaki K, Levine S, Felsen P, Malik J. Recurrent network models for human dynamics. In Proceedings of the IEEE International Conference on Computer Vision 2015.
[5] Gers FA, Schmidhuber J, Cummins F. Learning to forget: Continual prediction with LSTM. Neural computation. 2000.
[6] Glorot X, Bengio Y. Understanding the difficulty of training deep feedforward neural networks. In Aistats 2010 (Vol. 9, pp. 249-256).
[7] Glorot X, Bordes A, Bengio Y. Deep Sparse Rectifier Neural Networks. In Aistats 2011 (Vol. 15, No. 106, p. 275).
[8] Graves A. Generating sequences with recurrent neural networks. arXiv preprint arXiv:1308.0850. 2013.
[9] Graves A, Jaitly N. Towards End-To-End Speech Recognition with Recurrent Neural Networks. In ICML 2014 (Vol. 14, pp. 1764-1772).
[10] Geijtenbeek T, van de Panne M, van der Stappen AF. Flexible muscle-based locomotion for bipedal creatures. ACM Transactions on Graphics (TOG). 2013;32(6):206.
[11] Gregor K, Danihelka I, Graves A, Rezende DJ, Wierstra D. DRAW: A recurrent neural network for image generation. arXiv preprint arXiv:1502.04623. 2015.
[12] Hochreiter S, Schmidhuber J. Long short-term memory. Neural computation. 1997 Nov 15;9(8):1735-80.
[13] Hochreiter S. The vanishing gradient problem during learning recurrent neural nets and problem solutions. International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems. 1998;6(02):107-16.
[14] Holden D, Saito J, Komura T, Joyce T. Learning motion manifolds with convolutional autoencoders. In SIGGRAPH Asia 2015 Technical Briefs 2015 (p. 18).
[15] Holden D, Saito J, Komura T. A deep learning framework for character motion synthesis and editing. ACM Transactions on Graphics (TOG). 2016;35(4):138.
[16] Kanda N, Takeda R, Obuchi Y. Elastic spectral distortion for low resource speech recognition with deep neural networks. In Automatic Speech Recognition and Understanding (ASRU), 2013 IEEE Workshop on 2013 (pp. 309-314).
[17] Krizhevsky A, Sutskever I, Hinton GE. Imagenet classification with deep convolutional neural networks. In Advances in neural information processing systems 2012 (pp. 1097-1105).
[18] Lawrence S, Giles CL, Tsoi AC, Back AD. Face recognition: A convolutional neural-network approach. IEEE transactions on neural networks. 1997;8(1):98-113.
[19] Le QV, Jaitly N, Hinton GE. A simple way to initialize recurrent networks of rectified linear units. arXiv preprint arXiv:1504.00941. 2015.
[20] Nayebi A, Vitelli M. GRUV: Algorithmic Music Generation using Recurrent Neural Networks. 2015.
[21] Parkhi OM, Vedaldi A, Zisserman A. Deep Face Recognition. In BMVC 2015 (Vol. 1, No. 3, p. 6).
[22] Reil T, Husbands P. Evolution of central pattern generators for bipedal walking in a real-time physics environment. IEEE Transactions on Evolutionary Computation. 2002;6(2):159-68.
[23] Saxe AM, McClelland JL, Ganguli S. Exact solutions to the nonlinear dynamics of learning in deep linear neural networks. arXiv preprint arXiv:1312.6120. 2013.
[24] Sims K. Evolving virtual creatures. In Proceedings of the 21st annual conference on Computer graphics and interactive techniques 1994 (pp. 15-22).
[25] Taylor GW, Hinton GE. Factored conditional restricted Boltzmann machines for modeling motion style. In Proceedings of the 26th annual international conference on machine learning 2009 (pp. 1025-1032).
[26] Tieleman T, Hinton G. Lecture 6.5-rmsprop: Divide the gradient by a running average of its recent magnitude. COURSERA: Neural networks for machine learning. 2012;4(2).
7th INTERNATIONAL CONFERENCE ON FINANCIAL CRIMINOLOGY 2015
13-14 April 2015, Wadham College, Oxford, United Kingdom
Perceived “Tone From The Top” During A Fraud Risk Assessment
Geetha A Rubasundram *
School of Accounting, Finance and Quantitative Studies, Asia Pacific University, Technology Park Malaysia, 57000, Kuala Lumpur
Abstract
In recent years, the focus on good governance and control mechanisms has increased significantly due to the number of high-profile corporate failures caused by fraudulent acts of top management. Recent fraud cases reflect a deceptive “tone from the top”, even where the organisation had reported good governance and control systems. This research analysed the importance of perceived management support and its effect on the organisational culture during a fraud risk assessment. The study used Action Research, as the researcher intended an in-depth study of the factors affecting organisational fraud.
© 2015 The Authors. Published by Elsevier B.V. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
Peer-review under responsibility of ACCOUNTING RESEARCH INSTITUTE, UNIVERSITI TEKNOLOGI MARA
Keywords: Fraud Risk Assessment; Internal Controls; Culture; Tone from the Top; Corporate Executive Fraud; Occupational Fraud.
1. Introduction
Fraud is an intentional act designed to deceive others, resulting in the victim suffering a loss after relying on the deceit and the perpetrator achieving a gain (AICPA, 2008). This research paper focuses on fraud committed by top managers (corporate executives), also known as white-collar crime. A white-collar crime is committed by a person of respectability and high social status in the course of his occupation (Sutherland, 1949). Choo & Tan (2007) define Corporate Executive Fraud as an intentional financial misrepresentation by trusted executives of public companies.
It is vital to differentiate between the various classifications of white-collar crime, as each crime reflects different red flags, motivations and circumstances. Clinard & Quinney (1973) differentiated occupational fraud from
* Corresponding author. Tel.: +60126021394.
E-mail address: geetha@apiit.edu.my
doi:10.1016/S2212-5671(15)01087-4
Geetha A. Rubasundram / Procedia Economics and Finance 28 ( 2015 ) 102 – 106 103
corporate crimes. They claim that occupational crimes treat the organisation as the victim, with the crimes being committed against the organisation for the benefit of the perpetrator. Wells (2007) agrees, defining Occupational Fraud as the misuse of one's occupation to achieve personal enrichment through the deliberate misuse or misapplication of the employing organization's resources or assets. Corporate crimes, on the other hand, are committed by the perpetrator for the benefit of the corporation (Clinard & Quinney, 1973). Zahra et al. (2005) mention that white-collar crimes can also be classified according to the extent of individual involvement. Their research cites the work of Daboub et al. (1995), which distinguishes between active participation (individuals are actively involved in illegal activities) and passive acquiescence (managers are aware of illegality within the organisation but are unwilling to take corrective action); Kelman & Hamilton (1989), who discuss “Crimes of Obedience” (an individual caught in the dilemma of either carrying out a directive that is wrong, or disobeying an order and suffering the consequences); and Hamilton & Sanders' (1999) “Second Face of Evil” (the result of actions of individuals who occupy positions and make routine decisions according to procedure, which eventually escalate into either cover-ups or disasters). This research provides an overview of a fraud risk assessment and the organizational culture of a selected case study.
2. Fraud Risk Assessment
Implementing anti-fraud controls appears to have helped reduce the cost and duration of fraud schemes (ACFE, 2014). However, the perceived mechanisms of a good governance and control system can be misleading. For example, Enron had all the mechanisms of a good governance and control system; however, due to the management culture and decisions, fraudulent activities still took place. Gebler (2005) states that culture is the leading risk factor for compromising integrity and compliance in companies.
Too many controls could cause bureaucracy, adding cost and time. CIMA (2009) states that the Internal Control System consists of an organisation's policies and procedures that, taken together, support the organisation's effective and efficient operation. COSO (2012) defines an Internal Control System as a process, effected by an entity's board of directors, management, and other personnel, designed to provide reasonable assurance regarding the achievement of objectives relating to operations, reporting and compliance.
One of the risks an organisation faces that may affect the achievement of its objectives is the risk of fraud. CIMA and COSO recommend a fraud risk assessment to ensure proper focus, identification, and management of fraud risk in an organisation. O'Bell (2009) believes that the Fraud Risk Assessment helps to focus management's attention on the significant fraud risks to be addressed and managed.
The researcher used a framework summarised by Rubasundram (2014) for the Fraud Risk Assessment:
a) Management to initiate the Fraud Risk Assessment & Management by setting up a Fraud Risk Assessment & Management Team (FRAM Team) to set goals, objectives and the fraud risk appetite of the organization.
b) FRAM Team to carry out brainstorming activities, process mapping, necessary checks, audits & tests and discussions / interviews with other personnel to understand:
i. The organization's environment, culture, management, business, departments, processes, functions and process owners.
ii. Potential fraud scenarios and schemes, taking into account red-flag areas such as management override of controls and personnel who may exhibit the factors identified in the Fraud Triangle, Fraud Diamond, the New Fraud Triangle and the New Fraud Diamond. Heiman-Hoffman and Morgan (1996) ranked management attitude as the most important warning sign (red flag).
iii. Categorise and assess the likelihood & impact of the fraud schemes.
iv. Review the current controls in place and identify the gap in terms of any additional or revised controls needed, in line with the organization's risk appetite and chosen risk strategy.
v. Implement and monitor controls with periodic evaluation.
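Step iii of the framework above (assessing the likelihood and impact of fraud schemes) is commonly implemented as a simple risk matrix. The sketch below is illustrative only: the 1-5 scales, the example schemes, the scores, and the risk-appetite threshold are assumptions, not part of the summarised framework.

```python
# Minimal fraud-risk register sketch: score each scheme as
# likelihood x impact (both on an assumed 1-5 scale) and flag schemes
# above an assumed risk-appetite threshold for additional controls.

RISK_APPETITE = 12  # illustrative threshold, set by the FRAM team

schemes = [
    {"scheme": "management override of controls", "likelihood": 4, "impact": 5},
    {"scheme": "expense reimbursement fraud",     "likelihood": 3, "impact": 2},
    {"scheme": "financial statement fraud",       "likelihood": 2, "impact": 5},
]

def assess(register, appetite=RISK_APPETITE):
    """Score each scheme and rank the register from highest to lowest risk."""
    for item in register:
        item["score"] = item["likelihood"] * item["impact"]
        item["action"] = "mitigate" if item["score"] > appetite else "monitor"
    return sorted(register, key=lambda i: i["score"], reverse=True)

for item in assess(schemes):
    print(item["scheme"], item["score"], item["action"])
# management override of controls 20 mitigate
# financial statement fraud 10 monitor
# expense reimbursement fraud 6 monitor
```

Steps iv and v would then attach existing and proposed controls to each flagged scheme and re-score it periodically.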
3. Development of Fraud Theories
The question arises as to what would make management violate their position of trust. The writer agrees with Zahra et al. (2005) that management fraud is often committed by highly successful people, who have everything to
lose if discovered. Cressey (1973) concluded that trusted persons become trust violators when they view themselves as having a non-shareable financial problem that is solvable by unethical means, violating their position of trust. Choo & Tan (2007) described the “Broken Trust Theory”, in relation to the work of Albrecht et al. (2004), to explain Corporate Executive Fraud. Their research linked Agency Theory and Stewardship Theory to the Fraud Triangle concept. Agency theory (Jensen & Meckling, 1976) describes the principal-agent relationship between owners and executives, with top executives acting as agents whose personal interests do not naturally align with shareholder interests. Stewardship theory views corporate executives as stewards of their companies who will choose the interests of the stockholders over self-interest, regardless of personal motivations or incentives (Sundaramurthy & Lewis, 2003). These two theories seek to align the interests of corporate executives with those of shareholders, and question the complexity of corporate executives' behaviour.
Over the years, Donald Cressey's hypothesis, popularly known as the Fraud Triangle, identified three variables: the pressure or motivation to carry out the fraud, the opportunity to carry out the fraud, and the ability to rationalise the act as consistent with one's personal level of integrity. Albrecht et al. (1984) state that all three elements (opportunity, pressure and rationalisation) appear to be necessary for a fraud to be committed. It is interesting to note Kassem & Higson's (2012) expression of Cressey's rationalisation concept: “most fraudsters are first time offenders with no criminal record”, and the trust violators knew the behaviour to be wrong but “merely kidded themselves into thinking that it was not illegal”. The ACFE's Report to the Nations 2014 reflects that 86% of perpetrators had never been charged or convicted. In Cressey's theory, the pressure component initially reflected a non-shareable financial problem, which was later developed to include non-financial components (Albrecht et al., 2008, 2010), political and social pressure, and subsequently personal pressure, corporate/employment pressure and external pressure (Kassem & Higson, 2012). CIMA (2009) describes how the opportunity for fraud arises in companies where there is a weak internal control system, poor security over company property, little fear of exposure or likelihood of detection, or unclear policies concerning acceptable behaviour.
Critics of the fraud triangle argued that it provides a limited perspective, since it is difficult to observe the rationalisation and pressure effects, and it ignores important factors such as the capabilities of the fraudster and culture. Wolfe & Hermanson (2004) introduced the Fraud Diamond, extending the Fraud Triangle to include the fraudster's capabilities. The capability component takes into account the fraudster's position or function within the organisation, which may furnish the ability to create or exploit an opportunity for fraud not available to others, including the ability to exploit internal control weaknesses. The fraudster's personality also plays a role in this model. Wolfe & Hermanson (2004) identify ego, confidence of non-detection or non-penalisation, and the ability to coerce others to commit or conceal fraud, in line with Kelman & Hamilton's (1989) “Crimes of Obedience”.
Finally, Wolfe & Hermanson also believe that the fraudster has the ability to lie effectively and consistently, and to deal with the accompanying stress. Kassem & Higson (2012) discuss the New Fraud Triangle Model in their paper, citing Dorminey et al. (2010). In this model, they suggest expanding the motivation component of the original fraud triangle to include Money, Ideology, Coercion and Ego (MICE). Ideological motivators justify the fraud by believing that their action will bring greater benefits, consistent with their beliefs (ideology). The model replaces rationalisation with a personal integrity scale, introduced by Albrecht et al. (1984); personal integrity is seen as the personal code of ethical behaviour each person adopts. It also includes the fraudster's capability, following the development of the Fraud Diamond. Gbegi & Adebisi's (2013) “New Fraud Diamond Model” adds the National Value Systems (NAVS+MICE) and Corporate Governance as additional elements.
4. Methodology
This research analyses the organisational culture based on the dimensions identified by Berry (2004), Glover & Aono (1995) and Gebler (2005). Gebler (2005) assesses culture from the angles of the overall organisation culture, the employees' culture and the managers' culture. Berry (2004) discusses seven dimensions of organisational culture: 1. Vigilance, 2. Engagement, 3. Credibility, 4. Accountability, 5. Empowerment, 6. Courage, 7. Options. Gebler (2005) states that an ethical organisation reflects the following seven levels: 1. Social Responsibility, 2. Sustainability, 3. Alignment, 4. Accountability, 5. Systems & Processes, 6. Communication, 7. Financial Stability. The researcher's focus on the employee reflection process via a Culture Risk Assessment is relevant for reflecting on the ethical culture of the organisation as a whole. Glover & Aono (1995) discuss that the
corporate culture review should include a review of employee turnover rates at all levels, reasons for turnover, average employee tenure, the nature and magnitude of customer complaints, product quality and related warranty expense experience, the nature of legal battles, employee morale as noted via observation, grievances, attendance and newsletter comments, comparison of wages with industry averages, credit rating, Better Business Bureau reports, employee benefits, and investment in employees such as training and office support. Glover & Aono (1995) also assert that the auditor should assess management philosophy as documented in the policies and procedures manual, including an analysis of the corporate board minutes.
Each fraud risk assessment normally reflects different circumstances and evidence. Action Research enables research to take place in real-life situations and to solve actual problems or issues. Therefore, the researcher needed access to an organisation's processes, policies, documents, employees, etc., and also to be involved in decision-making, policy setting, implementation and monitoring. It was also crucial that the organisation reflected red flags (symptoms of fraud), especially from top management. The organisation selected also faced cash flow and financial issues due to fast-paced growth, which was in line with the researcher's criteria, since the temptation to commit fraud would be higher in a financially troubled company. It also had a multicultural employee base, which added value to the cultural risk assessment. The red flags noted include limited communication channels between the board, executive management, and department heads. The Management Information Systems had performance issues that led to incomplete accounting records and reports. This was aggravated by high turnover, especially among accounting personnel.
5. Results
Based on the initial assessment, the possibility of fraud involving the top executives seemed high. However, the FRAM team concluded that the motive was more towards ego (not wanting to lose reputation) and the ideology of the betterment of the organisation, i.e., an inclination towards Corporate Executive Fraud. The top executives were initially extremely supportive of the overall FRA, in their attempt to be perceived as accountable and transparent. The second-level (operational) managers were rather divided and passively acquiescent, in line with the “Crimes of Obedience”. During the assessment, it was interesting to note that the operational managers focused on lower-level employee fraud as the issue, rather than on the top executives, until prompted and trained.
The employees came from a variety of cultures and backgrounds, and generally conformed to the organisational culture. The management and culture of the organisation reflected an ethical culture, with strong controls and procedures. However, the group of top-level executives dominated the organisation and demanded obedience. Most employees were of the opinion that their job security mattered most, which led them to carry out certain practices with which they were not comfortable. Training the employees improved the culture and empowerment factors over time. Although they were aware of the ethical standards and their roles in upholding these standards (vigilance), most employees did not engage themselves, due to a lack of credibility in the overall organisation's commitment. Since the issues did not involve them directly, they felt neither accountable beyond their own work nor empowered to disclose what they saw. Because the employees felt divided, it was easy to note that they lacked the courage and initiative to rectify the culture, especially since they were not aware of any other options available to them.
6. Conclusion
Setting up internal controls to prevent fraudulent activities in an organization can only be effective when the tone from the top is strong. Until the committee was set up to oversee management, there was considerable conflict and many attempts to override the controls in place, even though the organisation had already set up some interim controls. The main issues the team had to contend with were the perceived top-executive attitude, the lack of commitment from the other employees, and the overriding of controls. However, with the gradual education and training of the employees, this was eventually minimised. The FRAM Team noted that without the support of the Directors, successful implementation might not have been possible, as the Executives would have eventually overridden the controls. Clear policies, the provision of whistleblowing options, and access to the Directors improved the overall perception of the culture and the employees' morale. This was noted post-implementation, with managers and other employees taking a proactive stand against unethical decisions.
References
ACFE: Association of Certified Fraud Examiners, 2014. Report to the Nations 2014.
AICPA et al., 2008. Managing the Business Risk of Fraud: A Practical Guide.
Albrecht, C., Turnbull, C., Zhang, Y., Skousen, C.J., 2010. The relationship between South Korean chaebols and fraud. Managerial Auditing Journal, Vol. 33(3).
Albrecht, S., Howe, K., Romney, M., 1984. Deterring Fraud: The Internal Auditor's Perspective. Institute of Internal Auditors Research Foundation.
Albrecht, W.S., Albrecht, C.C., Albrecht, C.O., 2008. Current trends in fraud and its detection. Information Security Journal: A Global Perspective, Vol. 17.
Berry, Benisa, 2004. Organisational culture: A framework and strategies for facilitating employee whistleblowing. Employee Responsibilities and Rights Journal, Vol. 16.
Choo, F., Tan, K., 2007. An American Dream theory of corporate executive fraud. Accounting Forum (Elsevier).
CIMA: Chartered Institute of Management Accountants, 2009. CID Tech Guide: Fraud Risk Management.
Clinard, M.B., Quinney, R., 1973. Criminal Behaviour Systems: A Typology. New York: Holt, Rinehart & Winston.
Committee of Sponsoring Organizations of the Treadway Commission (COSO), 2012. Enterprise Risk Management: Understanding & Communicating Risk Appetite.
Cressey, D.R., 1973. Other People's Money. Montclair: Patterson Smith.
Daboub, A.J., Rasheed, A.M.A., Priem, R.L., Gray, D., 1995. Top management team characteristics and corporate illegal activity. Academy of Management Review.
Dorminey, J., Fleming, S., Kranacher, M., Riley, R., 2011. The evolution of fraud theory. American Accounting Association Annual Meeting, Denver, Aug., pp. 1-58.
Gbegi, D.O., Adebisi, J.F., 2013. The new fraud diamond model: how can it help forensic accountants in fraud investigation in Nigeria? European Journal of Accounting, Auditing and Finance Research.
Gebler, David, 2005. Is Your Culture a Risk Factor? Using Culture Risk Assessments to Measure the Effectiveness of Ethics & Compliance Programs (www.workingvalues.com, last accessed 11th March 2015).
Glover, H.D., Aono, J.Y., 1995. Changing the model for prevention and detection of fraud. Managerial Auditing Journal, Vol. 10.
Hamilton, V.L., Sanders, J., 1999. The second face of evil: Wrongdoing in and by the corporation. Personality and Social Psychology Review 3: 222-233.
Heiman-Hoffman, V., Morgan, K.P., 1996. The warning signs of fraudulent financial reporting. Journal of Accountancy (October 1996): 75-77.
Jensen, M.C., Meckling, W.H., 1976. Theory of the firm: Managerial behavior, agency costs and ownership structure. Journal of Financial Economics, 3, 305-360.
Kassem, R., Higson, A.W., 2012. The new fraud triangle model. Journal of Emerging Trends in Economics and Management Sciences, 3(3), pp. 191-195.
Kelman, H.C., Hamilton, V.L., 1989. Crimes of Obedience. New Haven, CT: Yale University Press.
O'Bell, Erick, 2009. 5 Anti-Fraud Strategies to Deter, Prevent and Detect Fraud. Corporate Compliance Insight.
Rubasundram, Geetha, 2014. Fraud risk assessment: A tool for SMEs to identify effective controls. Research Journal of Accounting & Finance.
Sundaramurthy, C., Lewis, M., 2003. Control and collaboration: Paradoxes of governance. Academy of Management Review 28, 397-416.
Sutherland, E.H., 1949. White Collar Crime. New York: Dryden Press.
Wells, Joseph T., 2007. Corporate Fraud Handbook: Prevention and Detection, 2nd Edition. John Wiley & Sons.
Wolfe, D.T., Hermanson, D.R., 2004. The fraud diamond: Considering the four elements of fraud. CPA Journal 74(12): 38-42.
Zahra, S.A., Priem, R.L., Rasheed, A.A., 2005. The antecedents and consequences of top management fraud. Journal of Management 31: 803.
Analysis of Netflix architecture and business model
Elena Oat
Aalto University School of Science
elena.oat@aalto.fi
Abstract
Advances in technology and the current capabilities of home networks allow people to watch their favourite shows in the comfort of their own household at any time of the day in exchange for a low fee. Moreover, the same video content is accessible on a range of mobile devices while away from home. Video-on-Demand (VoD) and, specifically, streaming video technology enables users to access content instantly and provides other convenient functionalities, such as rewind and pause. At the moment, the number of companies offering such services is large, creating a highly competitive market in this area. This motivates the market players to innovate, develop their products and provide better service to their customers in order to survive the competition. Among them is Netflix, the leading streaming service provider, which has the world's largest customer base. This paper provides an overview of VoD technology and an analysis of the Netflix case. The study identifies the factors that drove the service provider to its leading position in the streaming video market.
KEYWORDS: Netflix, VoD, streaming video, cloud, CDN
1 Introduction
With continuously rising bandwidth, device processing power and advancing communication technologies, VoD has become a viable service for home entertainment, distance learning, and digital commerce. VoD offers the possibility to watch TV shows at any time and as many times as desired. Additionally, VoD consumers are able to use the VCR functionalities of which they are fond: rewind, pause, fast forward, etc. Moreover, some VoD operators provide their customers with access to multimedia content on the move via their mobile devices (smartphones, tablets), as long as their network connectivity is good enough for streaming.
VoD is delivered to its consumers in a variety of ways, which results in different user experiences and quality of service. Multimedia content is either fully downloaded to storage and viewed afterwards, or accessed while it is still being downloaded. In the latter case, the first part of the media is watched while the remaining bits of the content are being downloaded. This type of VoD is called streaming video. An advantage of this technology is instantaneous content availability on devices that support it, although the achievable quality depends largely on the capabilities of the device. Streaming means less frustration for users, as waiting times are considerably reduced compared to technology where the video must be fully downloaded before it becomes available for watching.
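The reduced waiting time can be made concrete with a back-of-the-envelope calculation. All figures below (file size, bitrate, bandwidth, buffer size) are invented for illustration and are not taken from any provider:

```python
# Back-of-the-envelope startup-latency comparison (all figures invented):
# full download must fetch the entire file before playback; streaming
# can start once a small initial buffer is filled.

FILE_MB = 1800          # ~1.5 h film at ~2.7 Mbit/s, assumed
BANDWIDTH_MBPS = 20     # download link, megabits per second, assumed
BUFFER_SECONDS = 10     # seconds of video buffered before playback, assumed
BITRATE_MBPS = 2.7      # video bitrate, assumed

def seconds_to_download(megabytes, mbps):
    """Transfer time for a file of the given size over the given link."""
    return megabytes * 8 / mbps

full_download_wait = seconds_to_download(FILE_MB, BANDWIDTH_MBPS)
streaming_wait = BUFFER_SECONDS * BITRATE_MBPS / BANDWIDTH_MBPS

print(round(full_download_wait))  # 720 seconds before playback starts
print(round(streaming_wait, 2))   # 1.35 seconds before playback starts
```

Under these assumed numbers, the viewer waits minutes for a full download but only seconds for streaming, which is the reduction in frustration the text refers to.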
The VoD market in Western Europe and North America has a rather wide range of providers that compete for dominance and customer base by providing unlimited content packages at the lowest prices, as well as content diversity and exclusiveness (e.g. "House of Cards", accessible on Netflix, which holds exclusive rights to stream the series). Providers include Netflix, the world's largest online video service [2], Amazon's LoveFilm, HBO, Warner Bros, Viaplay, Hulu and Voddler, to name a few. Due to fierce competition in the market, these companies try to ensure that their consumers are satisfied and spend the least amount of time on content search and configuration. They do this by using recommendation features driven by the customer's own preferences, as well as by social networks.
The purpose of this study is to analyse the deciding factors that contributed to the success of the Netflix Internet television network and its VoD service from a technical and business point of view. The paper is divided into the following sections: Introduction, Background, Netflix, Discussion and Conclusion. The Introduction and Background present the general idea of the VoD service, including its technical aspects. The Netflix section focuses on how Netflix operates. The Discussion section considers how and why Netflix has succeeded. Finally, the last section concludes the paper.
2 Background
VoD has grown enormously in popularity over recent years due to the convenience and value of its services: entertainment, distance learning, and video-conferencing. In essence, VoD represents a mix of services and technologies: video compression, multimedia storage, video transmission and video reception. Each of these components went through several phases of modification and improvement, which led to the viability of the VoD service.
2.1 VoD architectures and cost classification
VoD services are categorized by their system architecture. According to Mir et al. [10], the relevant VoD architecture types are: centralized, proxy-based, distributed, peer-to-peer (P2P), Content Delivery Network (CDN) and hybrid. In the first case, a central unit called the video server plays the role of content disseminator and serves requests from content consumers. It also acts as a multimedia repository. Examples of services that utilize such an architecture are YouTube and CNN Pipeline, according to a study by A. Vinay et al. [16].
Aalto University T-110.5191 Seminar on Internetworking Spring 2013
The distributed architecture, on the other hand, removes the dependency on one central unit and moves the multimedia content to a set of computers situated in different geographical locations. This type of architecture is superior to the centralized one in scalability and efficiency.
A hybrid architecture, in which the previous two designs are combined, is considered as yet another option. Peer-assisted VoD, described by Huang et al. in [7], is an example of an architecture where the traditional client-server model is replaced with a decentralized one. In a peer-assisted network, a client who consumes data simultaneously uploads content to other peers who requested the same multimedia file. In a similar fashion, a peer receiving data from another peer acts as a source for others. This approach reduces both the strain put on a server and the bandwidth costs for the service provider.
Additionally, VoD services are classified by their cost. Consumers are offered pay-per-view in the case of Near VoD (NVoD), or the possibility of unlimited access to content with Subscription VoD (SVoD), a service in which users pay a monthly fee and are not charged per watched title. At the same time, VoD providers may offer their existing customers free viewings of less popular or older films. NVoD is currently losing popularity among customers because of its limited availability (films can be watched only at a particular time, and only if enough users sign up for them), while SVoD is gaining traction.
2.2 Problems faced and solutions
Although the latest network technologies and capabilities provide better possibilities than the older generations, VoD still faces problems related to the delivery of content over the network, and the number of users has increased as well. To ensure high-quality video for their consumers, VoD providers have to invest in solutions that allow quick and reliable delivery of content, which are costly.
The operational efficiency of a VoD company depends on the architecture it uses. The scientific literature has broadly studied P2P and hybrid architectures as a means to reduce the price and the load on central servers, CDNs and proxy servers. A P2P solution completely removes central units from the architecture, so that all nodes share the same privileges and responsibilities of a client and a server. This architecture ensures load balancing, but at the same time creates other problems: participants can join and leave the network at unpredictable times, resulting in instability. A hybrid architecture, on the other hand, represents a compromise between the P2P and centralized models. In such an architecture, participants and consumers of the content act as multimedia storage units as well as distributors, while the central components remain present.
Besides the load balancing and scalability challenges that a centralized VoD system faces at peak times, it is, conversely, not utilized to its full capacity at other times. Thus, investing in an expensive infrastructure whose full potential is not always used is wasteful. A paper by Li et al. [9] studies how cloud services, such as Amazon’s AWS and Microsoft’s Azure, can be brought into a VoD architecture to cope with uneven traffic and save the operator’s expenses. The paper proposes a cloud-assisted solution, where clients are served partly by the provider’s servers and partly by the cloud. Such a model is composed of the VoD provider’s servers, cloud storage, cloud CDNs and clients. Cloud CDNs allow fast content delivery, as the network consists of a multitude of edge servers which serve the clients closest to them. They also save bandwidth-related costs (pay per byte), even though renting a cloud infrastructure can be more expensive than owning one. Furthermore, cloud solutions alleviate traffic bursts.
One of the solutions that Netflix, one of the leading VoD providers, chose to address the above-mentioned challenges was to move from its own infrastructure to the cloud in 2010. According to the company’s tech blog article [4] by J. Ciancutti, functionalities such as search engines, recommendation systems, streaming servers, content stores and database solutions were deployed in Amazon Web Services (AWS). The migration to the cloud was driven by the need for continuous scalability, reliability and availability. AWS provides access to additional storage and other resources almost instantly, in contrast to a data-center solution, where the infrastructure has to be planned beforehand and cannot be changed dramatically on short notice. According to Netflix, predicting future growth is a complex task that yields imprecise results, whereas AWS alleviates the challenges related to customer base prediction.
In a different article on the same blog [3], J. Ciancutti presents a brief overview of the challenges Netflix had to overcome during the implementation phases of the migration to AWS. Among these are problems related to co-tenancy, because resources in AWS are shared.
Another factor that has to be taken into account when designing an efficient VoD architecture is video popularity. Obviously, the most popular videos need to be stored in several locations for fast access to the multimedia content. On the other hand, cloud storage is expensive and, additionally, the popularity of videos fades quickly: more than one tenth of Hulu’s top videos are replaced by others every hour, according to the study by Li et al. [9]. Thus, an optimal update algorithm has to be devised, according to which video content is updated on the peripheral servers.
2.3 Competition coming from torrents
An important aspect in the context of VoD services is the competition between VoD operators that deliver legal content to their consumers and illegal content sharing sites enabled by the BitTorrent P2P sharing protocol. In order to be profitable, VoD services have to beat their illegal opponents by providing a better, more efficient and simpler-to-use service, so that people are willing to pay money instead of seeking the free alternative delivered via BitTorrent.
As stated by A. Kosnik [8], multimedia streaming services have yet to work on their attractiveness. It appears that it’s easier to install a file sharing client and download from a rich selection almost any film.
Overall, file sharing sites are more flexible in many respects: there are no regional restrictions on which content is available, the selection is larger, and there are no time limitations, as most multimedia is uploaded
almost immediately after its release. VoD customers, on the other hand, may have to wait a definite amount of time before the videos are made available to them.
Besides, VoD providers lack consistency in their user interfaces. If a client wants to access a media file on one operator’s site, he or she first has to learn how to use it: what features are available, how search works, and so on. If the same client later decides to switch to another provider, due to a lack of desired content, the learning starts again from the beginning. Moreover, a video downloaded via a P2P file sharing client is not limited to specific devices and players, as VoD content often is (for example, products bought on iTunes can be played only on the proprietary iTunes media player on Mac or Windows, while Linux is not supported). Additionally, content uploaded to pirate networks is commercial-free, as advertisements are usually cut from the media before it is uploaded.
As noted by A. Kosnik [8], VoD technology could, in fact, benefit and learn from the experience of the P2P file sharing community. Fortunately, the technologies and standards implemented by this community are open and available to the public. Streaming service operators could build on already successful solutions and provide additional functionality that would make their services more attractive. One example could be permanent access to any media file, even older ones. This seems to be a problematic task for torrent networks, as there might be no seeders (peers that provide the video source) present.
Another big advantage of torrents over streaming is that files are downloaded locally and can be watched offline, which also results in a smoother user experience, as there is no latency caused by network congestion. VoD providers might likewise grow their customer base by providing both streaming and download options.
3 Netflix
The main goal of this study is to present a brief overview of the technology, business and marketing decisions that Netflix, the largest VoD provider, has taken, and their impact on its current leadership.
3.1 Architecture
A clear and well-defined study of the Netflix network architecture has been done by Adhikari et al. [1]. According to the authors, four main components play a role in the overall system operation: a player (Silverlight for desktop computers), Content Delivery Networks (CDNs) that deliver the streaming content to the client, Amazon’s cloud services, and data centers that belong to Netflix.
The Silverlight player is supported by most browsers. It downloads, decodes and plays the video requested by a consumer. Although this is a problem-free procedure for Windows and Mac users, Linux users still face problems when watching videos provided by Netflix. There are several workarounds for accessing the VoD content: one is installing Windows in a VM and watching Netflix from there; another is installing the Netflix Desktop App described in [15]. However, neither of these solutions is officially supported by the VoD provider.
Netflix uses three Content Delivery Networks for content streaming: Level3, LimeLight and Akamai. Each of them has a rank specified in the manifest file that the player downloads before the content is streamed. The rank number determines the client’s order of preference when choosing a CDN.
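The rank-based preference logic described above can be sketched as follows. The dictionary field names and the `choose_cdn` helper are hypothetical illustrations of the idea, not Netflix’s actual manifest format:

```python
# Hypothetical sketch: pick the CDN with the best (lowest) rank from a
# manifest, falling back to the next rank if a CDN is unavailable.
# Field names ("name", "rank") are assumptions for illustration only.

def choose_cdn(manifest_cdns, unavailable=()):
    """Return the name of the available CDN with the lowest rank number."""
    candidates = [c for c in manifest_cdns if c["name"] not in unavailable]
    if not candidates:
        raise RuntimeError("no CDN available")
    return min(candidates, key=lambda c: c["rank"])["name"]

manifest = [
    {"name": "Level3", "rank": 1},
    {"name": "LimeLight", "rank": 2},
    {"name": "Akamai", "rank": 3},
]

print(choose_cdn(manifest))                          # rank 1 wins: Level3
print(choose_cdn(manifest, unavailable={"Level3"}))  # falls back: LimeLight
```

The point of the sketch is only that preference is a total order given by the manifest, so failover is a simple "next best rank" decision on the client.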
Netflix’s own servers handle user registration and receive payments from customers. They then redirect the user to sign-up or to content streaming from Amazon’s cloud machines. Most of the activity happens exactly here, in the cloud: log recording, user sign-in, DRM, CDN routing operations, etc.
3.2 Netflix technologies
Netflix is an innovative company both in terms of technology and business. This conclusion can be drawn from their Tech Blog, available online, as well as from the decisions they have made over the years of the business’s evolution. HTML5 technology is used in the Netflix user interface (UI), which customers access on their Netflix Ready Devices (PS3, XBOX, etc.). This allows Netflix engineers to modify features of the user interface seamlessly: consumers do not have to download new software, as changes are available the next time they access the interface. This way Netflix also keeps up with the latest web technologies and innovates its products by adopting them.
However, Netflix is not yet able to use HTML5 for video playback. This is due to challenges related to the standardization of adaptive streaming in HTML5. Adaptive streaming is a streaming method over HTTP in which video is delivered at a bit rate suitable for the client. This bit rate is calculated according to the client’s bandwidth and CPU characteristics and is adjusted in real time according to the available resources. Currently, Microsoft Silverlight is the only container supported by Netflix for video streaming in web browsers. Mobile devices use native applications for video playback, except Samsung’s Chromebook, which now streams Netflix content via HTML5. As mentioned on Google Groups [5], ARM-based Chromebook users do not need to install any plugins or additional software on their Chrome OS in order to enjoy the Netflix streaming service.
The streaming strategy used by Netflix influences the amount of traffic that is transferred. Engineers have to keep in mind that transferring a huge amount of data at once might overwhelm the client. At the same time, some videos are never watched to the end and are interrupted at some point because their viewers lose interest. Therefore, it is unnecessary to send big chunks of data and waste resources if there is a significant possibility the data will never be used. Moreover, since 25-40% of Internet data traffic is due to video streaming, this subject becomes even more relevant. Additionally, the authors of Sandvine’s report [14] state that Netflix is one of the dominant streaming sources in North America, accounting for more than 30% of downstream traffic during peak periods. Moreover, the same report provides a clear picture of how Netflix outperforms
its rivals in terms of traffic share, which is 18, 20 and 60 times higher than that of Amazon Video, Hulu and HBO Go, respectively.
Figure 1: Netflix’s share of downstream traffic compared to its rivals (Sandvine report [14])
The streaming process typically happens in two stages: buffering and steady state. During the buffering phase, the video is downloaded at full bandwidth, whereas in the steady state the download happens in ON-OFF cycles: a block of data is fetched from the video servers, followed by a period of idleness. This mechanism prevents the media player from being overloaded, and saves traffic in case the user decides to stop watching the video.
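A toy model can illustrate how the ON-OFF strategy limits wasted traffic when a viewer abandons a video early. The buffer, block and cycle parameters below are invented for illustration and are not Netflix’s actual values:

```python
# Toy model of the two streaming phases: a buffering phase that pre-loads
# buffer_s seconds at full bandwidth, then a steady state in which one
# block_s-second block is fetched every cycle_s seconds of playback.
# All parameter values are illustrative assumptions.
import math

def downloaded_s(t, total_s, buffer_s=120, block_s=30, cycle_s=20):
    """Seconds of video on the client when the viewer quits t seconds in."""
    if t <= 0:
        return min(buffer_s, total_s)
    return min(buffer_s + block_s * math.ceil(t / cycle_s), total_s)

# A viewer abandons an hour-long video after five minutes:
t, total = 300, 3600
waste_onoff = downloaded_s(t, total) - t   # seconds fetched but never watched
waste_full = total - t                     # waste if the whole file were pushed upfront
print(waste_onoff, waste_full)             # prints 270 3300
```

Even in this crude model, the ON-OFF schedule wastes only a few minutes of fetched video on abandonment, versus almost the entire file for an upfront download.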
Rao et al. [13] present the results of a study on the streaming strategies used by Netflix. According to it, the steady state phase is characterized by short ON-OFF cycles when streaming in browsers and in the native iPad Netflix app, whereas the native Android app streams in long cycles. The amount of data downloaded during the buffering stage also differs between applications: for the iPad, it is four times smaller than for an Android device.
3.3 Business model
Netflix started as a video rental business in 1997, but the business has been adapting during its course of development. A clear example is how Netflix changed the DVD delivery method: users submit their requests for DVDs online, and the discs are delivered by post to their homes in a day or two, in an envelope reusable for the return. Moreover, rented content can be kept for as long as desired. All this brings convenience to the end customers, who are then more likely to remain loyal to the service.
The subscription model that Netflix chose is another factor that contributes to its success. It removes the need to track the amount of watched content and the worries related to high bills. Netflix charges $7.99 per month in the United States, and slightly more in Europe, for unlimited content access on any device, as long as the content is accessed on only one of these devices at a time. The fee is affordable, and together with the unlimited watch time helps explain why it is the leading streaming service; by comparison, a single DVD costs more.
To keep its customers interested, Netflix invests in new and exclusive content. For instance, at the end of 2012 it signed a licensing agreement with The Walt Disney Co., according to which films by Disney, Walt Disney Animation
Studios, Pixar Animation Studios, Marvel Studios, and Disneynature will be streamed exclusively by Netflix starting in 2016. In fact, the growing video content expenses raise worries among investors, as Netflix’s customer base will have to grow rapidly in order to cover them.
To demonstrate the value of the content Netflix currently offers its subscribers, the company published an infographic in its letter to shareholders for the fourth quarter of 2012. The picture shows how many of the top 200 Netflix titles are also streamed by other VoD services.
Figure 2: Number of top-200 Netflix titles also available on other VoD services (Q4 2012 letter to shareholders [12])
Netflix, whose customer base numbers more than 33 million streaming members worldwide, as stated on its own page [11], pays careful attention to customers’ opinions and satisfaction. In 2006 it announced the "Netflix Prize" competition, whose winner received one million dollars. The competition’s purpose was to find a recommendation algorithm that produced better results than Netflix’s own recommendation system, Cinematch. The idea behind the competition was to forecast more precisely which videos users would prefer to watch and to recommend these on the Netflix page, resulting in better customer satisfaction.
Another significant and recent step Netflix has taken to please its users is the introduction of "binge" watching. The whole season of the exclusive series "House of Cards", which premiered in February and received a highly positive response from viewers, was made available on its online service at once. Netflix thus chose customer satisfaction over the short-term profit that could have been obtained by releasing one episode per week, which would have forced viewers to remain subscribed to the service in order to watch more episodes.
According to the S&P 500 stock market index for the first quarter of 2013, Netflix leads the list of best-performing stocks: its share price grew by over 100 percent. The top five best-performing U.S. stocks are presented in Figure 3.
Figure 3: Top five best-performing U.S. stocks, Q1 2013 (FactSet [6])
4 Discussion
Marketing and business decisions are key to the Netflix corporation’s growth. Probably one of the most significant is providing unlimited viewing time of its content at a relatively low monthly subscription fee, which saves customers from worrying about a constantly growing bill. Additionally, films are accessible on a large set of devices. Linux, however, is still not officially supported by the Netflix service, although this is characteristic of other VoD providers as well.
Netflix provides a very convenient and affordable service, but people who have not tried or heard of it remain outside its loyal customer base. To address this limitation, Netflix offered a free one-month trial in countries where it had just entered the market, for instance in Scandinavia. In addition, Netflix non-members, i.e. those who have not signed up for the service, were given the opportunity to view one episode of its successful "House of Cards" series in order to entice them to sign up.
Netflix’s success stems from several factors. It has proved to be a very adaptive business, keeping pace with technology and the latest developments, being among the first to implement new concepts and thus becoming a model for other businesses.
Besides adopting the most recent technologies, Netflix introduces innovative elements into its business. One example of this innovative thinking is its DVD-by-mail rental method, which was copied by other companies, such as Blockbuster, to drive similarly high demand for DVD rental. At the same time, the company experiments in a diversity of areas, but closely watches the results in order to minimize its losses in case an idea fails.
Moreover, some of the company’s failures serve as lessons for future improvement, as in the case of the AWS outages. These motivate the business to act proactively and implement changes and improvements that eliminate service loss, or reduce it to a minimum, in force majeure situations. One of the methods used by Netflix is splitting the service by regions, so that a failure in one zone does not affect the other zones. Another practice that supports service resiliency is writing an incident report after each service failure and analysing it later to identify aspects that need to be handled better to prevent further malfunctioning.
5 Conclusion
An increasing number of VoD providers on the market creates favourable conditions for progress and better quality of service, while at the same time posing challenges for the market players. High competition is one of the drivers of innovation and development. Netflix, the leading streaming
provider, implements innovative methods and the latest technology to keep up with the competition. It acts proactively by planning ahead for future growth and investing in highly scalable solutions. It also attends to customer satisfaction by investing in the development of recommendation systems, while at the same time enriching its video content database. Although Netflix’s customer base grew significantly during the last year, more than predicted, it still faces challenges posed by competitors and the P2P sharing community. The service will have to develop further, and learn from its rivals in some respects, in order to maintain its position.
References
[1] V. Adhikari, Y. Guo, F. Hao, M. Varvello, V. Hilt, M. Steiner, and Z.-L. Zhang. Unreeling Netflix: Understanding and improving multi-CDN movie delivery. In INFOCOM, 2012 Proceedings IEEE, pages 1620–1628, March 2012.
[2] Bloomberg. Netflix Subscriber Gain of 2.05M Beats Expectations. Technical report, January 2013. http://www.bloomberg.com/video/. Resource last accessed 10.03.2013.
[3] J. Ciancutti. 5 Lessons We’ve Learned Using AWS. Technical report, December 2010. http://techblog.netflix.com/2010/12/5-lessons-weve-learned-using-aws.html. Resource last accessed 10.03.2013.
[4] J. Ciancutti. Four Reasons We Choose Amazon’s Cloud as Our Computing Platform. Technical report, December 2010. http://techblog.netflix.com/2010/12/four-reasons-we-choose-amazons-cloud-as.html. Resource last accessed 10.03.2013.
[5] M. Daniels. Netflix comes to the New Samsung Chromebook. Technical report. https://groups.google.com/forum/#!topic/chromebook-central/c4p5DdehuHs. Resource last accessed 03.04.2013.
[6] FactSet. Technical report, March 2013. http://www.factset.com. Resource last accessed 03.04.2013.
[7] C. Huang, J. Li, and K. W. Ross. Can Internet Video-on-Demand be profitable? In Proceedings of the 2007 conference on Applications, technologies, architectures, and protocols for computer communications, SIGCOMM ’07, pages 133–144, New York, NY, USA, 2007. ACM.
[8] A. D. Kosnik. Piracy is the future of television. Technical report, March 2010. http://boletines.prisadigital.com/piracy_future_television-full.pdf. Resource last accessed 08.03.2013.
[9] H. Li, L. Zhong, J. Liu, B. Li, and K. Xu. Cost-Effective Partial Migration of VoD Services to Content Clouds. In Cloud Computing (CLOUD), 2011 IEEE International Conference on, pages 203–210, July 2011.
[10] N. Mir, M. Nataraja, and S. Ravikrishnan. A Performance Evaluation Study of Video-on-Demand Traffic over IP Networks. In Advanced Information Networking and Applications (WAINA), 2011 IEEE Workshops of International Conference on, March 2011.
[11] Netflix. Company Facts. Technical report. https://signup.netflix.com/MediaCenter/Facts. Resource last accessed 13.03.2013.
[12] Netflix. Q4 12 letter to shareholders. Technical report, January 2013. http://ir.netflix.com/. Resource last accessed 03.04.2013.
[13] A. Rao, A. Legout, Y.-s. Lim, D. Towsley, C. Barakat, and W. Dabbous. Network characteristics of video streaming traffic. In Proceedings of the Seventh Conference on emerging Networking Experiments and Technologies, CoNEXT ’11, pages 25:1–25:12, New York, NY, USA, 2011. ACM.
[14] Sandvine. Global Internet Phenomena Report. Technical report, 2012. http://www.sandvine.com/downloads/documents/Phenomena_2H_2012/Sandvine_Global_Internet_Phenomena_Report_2H_2012.pdf. Resource last accessed 18.03.2013.
[15] TechRepublic. How to get Netflix streaming on Ubuntu 12.10. Technical report, December 2012. http://www.techrepublic.com/blog/opensource/how-to-get-netflix-streaming-on-ubuntu-1210/4019. Resource last accessed 20.03.2013.
[16] A. Vinay, P. Saxena, and T. Anitha. An efficient video streaming architecture for Video-on-Demand systems. In Signal and Image Processing (ICSIP), 2010 International Conference, pages 102–107, December 2010.
UNSUPERVISED LEARNING OF SPARSE FEATURES FOR SCALABLE AUDIO CLASSIFICATION
Mikael Henaff, Kevin Jarrett, Koray Kavukcuoglu and Yann LeCun
Courant Institute of Mathematical Sciences
New York University
mbh305@nyu.edu ; yann@cs.nyu.edu
ABSTRACT
In this work we present a system to automatically learn features from audio in an unsupervised manner. Our method first learns an overcomplete dictionary which can be used to sparsely decompose log-scaled spectrograms. It then trains an efficient encoder which quickly maps new inputs to approximations of their sparse representations using the learned dictionary. This avoids the expensive iterative procedures usually required to infer sparse codes. We then use these sparse codes as inputs for a linear Support Vector Machine (SVM). Our system achieves 83.4% accuracy in predicting genres on the GTZAN dataset, which is competitive with current state-of-the-art approaches. Furthermore, the use of a simple linear classifier combined with a fast feature extraction system allows our approach to scale well to large datasets.
1. INTRODUCTION
Over the past several years much research has been devoted to designing feature extraction systems to address the many challenging problems in music information retrieval (MIR). Considerable progress has been made using task-dependent features that rely on hand-crafted signal processing techniques (see [13] and [26] for reviews). An alternative approach is to use features that are instead learned automatically. This has the advantage of generalizing well to new tasks, particularly if the features are learned in an unsupervised manner.
Several systems to automatically learn useful features from data have been proposed over the years. Recently, Restricted Boltzmann Machines (RBMs), Deep Belief Networks (DBNs) and sparse coding (SC) algorithms have enjoyed a good deal of attention in the computer vision community. These have led to solid and state-of-the-art results on several object recognition benchmarks [8, 15, 23, 30].
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. © 2011 International Society for Music Information Retrieval.
Some of these methods have also begun to receive interest as a means to automatically learn features from audio data. The authors of [22] explored the use of sparse coding with learned dictionaries in the time domain, for the purpose of genre recognition. Convolutional DBNs were used in [16] to learn features from speech and music spectrograms in an unsupervised manner. Using a similar method, but with supervised fine-tuning, the authors in [12] were able to achieve 84.3% accuracy on the Tzanetakis genre dataset, which is one of the best reported results to date.
Despite their theoretical appeal, systems that automatically learn features also bring a specific set of challenges. One drawback of DBNs noted by the authors of [12] was their long training times, as well as the large number of hyper-parameters to tune. Furthermore, several authors using sparse coding algorithms have found that once the dictionary is learned, inferring sparse representations of new inputs can be slow, as it usually relies on some kind of iterative procedure [14, 22, 30]. This in turn can limit the real-time applications or scalability of the system.
In this paper, we investigate a sparse coding method called Predictive Sparse Decomposition (PSD) [11, 14, 15] that attempts to automatically learn useful features from audio data, while addressing some of these drawbacks. Like many sparse coding algorithms, it involves learning a dictionary from a corpus of unlabeled data, such that new inputs can be represented as sparse linear combinations of the dictionary’s elements. It differs in that it also trains an encoder that efficiently maps new inputs to approximations of their optimal sparse representations using the learned dictionary. As a result, once the dictionary is learned, inferring the sparse representations of new inputs is very efficient, making the system scalable and suitable for real-time applications.
2. THE ALGORITHM
2.1 Sparse Coding Algorithms
The main idea behind sparse coding is to express signals $x \in \mathbb{R}^n$ as sparse linear combinations of basis functions chosen out of an overcomplete set. Letting $B \in \mathbb{R}^{n \times m}$ ($n < m$) denote the matrix consisting of basis functions $b_j \in \mathbb{R}^n$ as columns, with weights $z = (z_1, \ldots, z_m)$, this relationship can be written as:

$$x = \sum_{j=1}^{m} z_j b_j = Bz \qquad (1)$$
where most of the $z_j$'s are zero. Overcomplete sparse representations tend to be good features for classification systems, as they provide a succinct representation of the signal, are robust to noise, and are more likely to be linearly separable due to their high dimensionality.
Directly inferring the optimal sparse representation $z$ of a signal $x$ given a dictionary $B$ requires a combinatorial search, intractable in high-dimensional spaces. Therefore, various alternatives have been proposed. Matching Pursuit methods [21] offer a greedy approximation to the solution. Another popular approach, called Basis Pursuit [7], involves minimizing the loss function:

$$L_d(x, z, B) = \frac{1}{2}\|x - Bz\|_2^2 + \lambda \|z\|_1 \qquad (2)$$
with respect to $z$. Here $\lambda$ is a hyper-parameter setting the tradeoff between accurate approximation of the signal and sparsity of the solution. It has been shown that the solution to (2) is the same as the optimal solution, provided it is sparse enough [10]. A number of works have focused on solving this problem efficiently [1, 7, 17, 20]; however, they still rely on a computationally expensive iterative procedure which limits the system’s scalability and real-time applications.
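As an illustration of such an iterative procedure, the sketch below minimizes (2) with ISTA-style iterative soft-thresholding on a toy problem. It is a generic example, not the specific algorithms of [1, 7, 17, 20], and the dimensions and hyper-parameters are arbitrary:

```python
# ISTA sketch for minimizing (2): alternate a gradient step on the
# quadratic term with a soft-thresholding step for the l1 penalty.
import numpy as np

def ista(x, B, lam, iters=500):
    """Approximately minimize 0.5*||x - Bz||^2 + lam*||z||_1."""
    step = 1.0 / np.linalg.norm(B, 2) ** 2      # 1/L, L = Lipschitz constant
    z = np.zeros(B.shape[1])
    for _ in range(iters):
        z = z - step * (B.T @ (B @ z - x))      # gradient step
        z = np.sign(z) * np.maximum(np.abs(z) - step * lam, 0.0)  # shrink
    return z

rng = np.random.default_rng(0)
B = rng.standard_normal((8, 16))
B /= np.linalg.norm(B, axis=0)                  # unit-norm, overcomplete (n < m)
z_true = np.zeros(16)
z_true[[2, 9]] = [1.0, -0.5]
x = B @ z_true                                  # signal with a 2-sparse code
z = ista(x, B, lam=0.01)
print(np.linalg.norm(x - B @ z))                # small reconstruction error
```

The hundreds of matrix-vector products per signal are exactly the per-input cost that PSD later amortizes away with a trained encoder.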
2.2 Learning Dictionaries
In classical sparse coding, the dictionary is composed of known functions such as sinusoids, gammatones, wavelets or Gabors. One can also learn dictionaries that are adaptive to the type of data at hand. This is done by first initializing the basis functions to random unit vectors, and then iterating the following procedure:
1. Get a sample signal x from the training set
2. Calculate its optimal sparse code $z^*$ by minimizing (2) with respect to $z$. Simple optimization methods such as gradient descent can be used, or more sophisticated approaches such as [1, 7, 20].
3. Keeping $z^*$ fixed, update $B$ with one step of stochastic gradient descent: $B \leftarrow B - \nu \frac{\partial L_d}{\partial B}$, where $L_d$ is the loss function in (2). The columns of $B$ are then rescaled to unit norm, to avoid trivial minimizations of the loss function where the code coefficients go to zero while the bases are scaled up.
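The three steps above can be sketched as follows. The inner sparse coder, the synthetic data and all hyper-parameters are illustrative assumptions:

```python
# Sketch of the dictionary-learning loop: sample a signal, infer its
# sparse code for (2), take a gradient step on B, renormalize columns.
import numpy as np

def soft(v, t):
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def sparse_code(x, B, lam, iters=100):
    """Step 2: approximate z* for (2) by iterative soft-thresholding."""
    step = 1.0 / np.linalg.norm(B, 2) ** 2
    z = np.zeros(B.shape[1])
    for _ in range(iters):
        z = soft(z - step * (B.T @ (B @ z - x)), step * lam)
    return z

def learn_dictionary(X, m, lam=0.05, lr=0.1, epochs=5, seed=0):
    rng = np.random.default_rng(seed)
    B = rng.standard_normal((X.shape[1], m))
    B /= np.linalg.norm(B, axis=0)            # init: random unit vectors
    for _ in range(epochs):
        for x in rng.permutation(X):          # step 1: sample a signal
            z = sparse_code(x, B, lam)        # step 2: sparse code z*
            B -= lr * np.outer(B @ z - x, z)  # step 3: dLd/dB = (Bz - x) z^T
            B /= np.linalg.norm(B, axis=0)    # rescale columns to unit norm
    return B

rng = np.random.default_rng(1)
X = rng.standard_normal((40, 8))              # 40 toy "signals" in R^8
B = learn_dictionary(X, m=16)
print(B.shape)                                # (8, 16)
```

The column renormalization after every update is what prevents the degenerate solution the text describes, where codes shrink to zero while the bases grow.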
Figure 1. Shrinkage function with θ = 1
There is evidence that sparse coding could be a strategy employed by the brain in the early stages of visual and auditory processing. The authors in [24] found that basis functions learned on natural images using the above procedure resembled the receptive fields in the visual cortex. In an analogous experiment [28], basis functions learned on natural sounds were found to be highly similar to gammatone functions, which have been used to model the action of the basilar membrane in the inner ear.
2.3 Predictive Sparse Decomposition
In order to avoid the iterative procedure typically required to infer sparse codes, several works have focused on developing nonlinear, trainable encoders which can quickly map inputs to approximations of their optimal sparse codes [11, 14, 15]. The encoder's architecture is denoted z = fe(x, U), where x is an input signal, z is an approximation of its sparse code, and U collectively designates all the trainable parameters of the encoder. Training the encoder is performed by minimizing the encoder loss Le(x, U), defined as the squared error between the predicted code z and the optimal sparse code z* obtained by minimizing (2), for every input signal x in the training set:
Le(x, U) = (1/2) ||z* − fe(x, U)||₂²   (3)
Specifically, the encoder is trained by iterating the following process:
1. Get a sample signal x from the training set and compute its optimal sparse code z* as described in the previous section.
2. Keeping z* fixed, update U with one step of stochastic gradient descent: U ← U − ν ∂Le/∂U, where Le is the loss function in (3).
In this paper, we adopt a simple encoder architecture given by:
fe(x, W, b) = hθ(Wx + b) (4)
where W is a filter matrix, b is a vector of trainable biases, and hθ is the shrinkage function given by hθ(x)ᵢ = sgn(xᵢ)(|xᵢ| − θᵢ)₊ (Figure 1). The shrinkage function sets any code components below the threshold θ to zero, which helps ensure that the predicted code will be sparse. Training the encoder is done by iterating the above process, with U = {W, b, θ}. Note that once the encoder is trained, inferring sparse codes is very efficient, as it essentially requires a single matrix-vector multiplication.
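A minimal sketch of the encoder in (4), assuming plain dense arrays (the class and method names are ours):

```java
public class PsdEncoder {
    // f_e(x; W, b) = h_theta(W x + b), with h_theta the soft shrinkage
    // h_theta(s)_k = sgn(s_k) * max(|s_k| - theta_k, 0).
    static double[] encode(double[][] W, double[] b, double[] theta,
                           double[] x) {
        int m = W.length, n = x.length;
        double[] z = new double[m];
        for (int k = 0; k < m; k++) {
            double s = b[k];
            for (int j = 0; j < n; j++) s += W[k][j] * x[j];
            // shrinkage yields exact zeros, so the predicted code is sparse
            z[k] = Math.signum(s) * Math.max(Math.abs(s) - theta[k], 0.0);
        }
        return z;
    }
}
```

A single call to this method replaces the whole iterative minimization of (2) at inference time, which is the source of the speed-up reported below.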
3. LEARNING AUDIO FEATURES
In this section we describe the features learned on music data using PSD.
3.1 Dataset
We used the GTZAN dataset first introduced in [29], which has since been used in several works as a benchmark for the genre recognition task [2, 3, 6, 12, 18, 25]. The dataset consists of 1000 30-second audio clips, each belonging to one of 10 genres: blues, classical, country, disco, hiphop, jazz, metal, pop, reggae and rock. The classes are balanced so that there are 100 clips from each genre. All clips are sampled at 22050 Hz.
3.2 Preprocessing
To begin with, we divided each clip into short frames of 1024 samples each, corresponding to 46.4 ms of audio. There was a 50% overlap between consecutive frames. We then applied a Constant-Q transform (CQT) to each frame, with 96 filters spanning four octaves from C2 to C6 at quarter-tone resolution. For this we used the toolbox provided by the authors of [27]. An important property of the CQT is that the center frequencies of the filters are logarithmically spaced, so that consecutive notes in the musical scale are linearly spaced. We then applied subtractive and divisive local contrast normalization (LCN) as described in [15], which consisted of two stages. First, from each point in the CQT spectrogram we subtracted the average of its neighborhood along both the time and frequency axes, weighted by a Gaussian window. Each point was then divided by the standard deviation of the new neighborhood, again weighted by a Gaussian window. This enforces competition between neighboring points in the spectrogram, so that low-energy signals are amplified while high-energy ones are muted. The entire process can be seen as a simple form of automatic gain control.
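The two LCN stages can be sketched in one dimension as follows. This is a simplified illustration (the paper applies the same idea jointly over the time and frequency axes of the spectrogram), and the class and method names are ours.

```java
public class Lcn {
    // Toy 1-D subtractive + divisive local contrast normalization with
    // a Gaussian window of width sigma.
    static double[] normalize(double[] v, double sigma) {
        int n = v.length;
        double[] centered = new double[n], out = new double[n];
        // Subtractive stage: remove the Gaussian-weighted local mean.
        for (int i = 0; i < n; i++) {
            double mean = 0, wsum = 0;
            for (int j = 0; j < n; j++) {
                double g = Math.exp(-0.5 * (i - j) * (i - j) / (sigma * sigma));
                mean += g * v[j];
                wsum += g;
            }
            centered[i] = v[i] - mean / wsum;
        }
        // Divisive stage: divide by the Gaussian-weighted local std. dev.
        for (int i = 0; i < n; i++) {
            double var = 0, wsum = 0;
            for (int j = 0; j < n; j++) {
                double g = Math.exp(-0.5 * (i - j) * (i - j) / (sigma * sigma));
                var += g * centered[j] * centered[j];
                wsum += g;
            }
            double sd = Math.sqrt(var / wsum);
            out[i] = sd > 0 ? centered[i] / sd : centered[i];
        }
        return out;
    }
}
```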
3.3 Features Learned on Frames
We then learned dictionaries on all frames in the dataset, using the process described in 2.2. The dictionary size was set to 512, so as to get overcomplete representations. Once the dictionary was learned, we trained the encoder to predict sparse representations using the process in 2.3. In both
Figure 3. Some of the functions learned on individual octaves. The horizontal axis represents log-frequency. Recall that each octave consists of 24 channels a quarter tone apart. Channel numbers corresponding to peaks are indicated. (a) A minor third (two notes 3 semitones apart); (b) a perfect fourth (two notes 5 semitones apart); (c) a perfect fifth (two notes 7 semitones apart); (d) a quartal chord (each note is 5 semitones apart); (e) a major triad; (f) a percussive sound.
to sounds caused by percussive instruments.

3.5 Feature Extraction
Once the dictionaries were learned and the encoders trained to accurately predict sparse codes, we ran all inputs through their respective encoders to obtain their sparse representations using the learned dictionaries. In the case of dictionaries learned on individual octaves, for each frame we concatenated the sparse representations of each of its four octaves, all of length 128, into a single vector of size 512. Extracting sparse features for the entire dataset, which contains over 8 hours of audio, took less than 3 minutes, which shows that this feature extraction system is scalable to industrial-size music databases.
4. CLASSIFICATION USING LEARNED FEATURES
We now describe the results of using our learned features as inputs for genre classification. We used a linear Support Vector Machine (SVM) as a classifier, using the LIBSVM library [5]. Linear SVMs are fast to train and scale well to large datasets, which is an important consideration in MIR.
4.1 Aggregated Features
Several authors have found that aggregating frame-level features over longer time windows substantially improves classification performance [2, 3, 12]. Adopting a similar approach, we computed aggregate features for each song by summing up sparse codes over 5-second time windows overlapping by half. We applied absolute value rectification to the codes beforehand to prevent components of different sign from canceling each other out. Since each sparse code records which dictionary elements are present in a given CQT frame, these aggregate feature vectors can be thought of as histograms recording the number of occurrences of each dictionary element in the time window.
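The aggregation step can be sketched as follows (an illustrative toy implementation; the class and method names are ours):

```java
public class Aggregate {
    // Rectify and sum frame-level codes over a window of len frames
    // starting at frame index start: the "histogram" aggregate feature
    // described above.
    static double[] pool(double[][] codes, int start, int len) {
        int dim = codes[0].length;
        double[] h = new double[dim];
        for (int t = start; t < start + len; t++)
            for (int k = 0; k < dim; k++)
                h[k] += Math.abs(codes[t][k]);   // rectify, then sum
        return h;
    }
}
```

In practice this is called once per 5-second window, with consecutive windows overlapping by half.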
4.2 Classification
To produce predictions for each song, we voted over all aggregate feature vectors in the song and chose the genre with the highest number of votes. Following standard practice, classification performance was measured by 10-fold cross-validation. For each fold, 100 songs were randomly selected to serve as a test set, with the remaining 900 serving as training data. This procedure was repeated 10 times, and the results averaged to produce a final classification accuracy.
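The per-song voting step can be sketched as follows. This is illustrative only; the paper does not specify how ties are broken, so the sketch breaks them by lowest class index.

```java
public class Vote {
    // Majority vote over per-window genre predictions.
    static int majority(int[] preds, int numClasses) {
        int[] counts = new int[numClasses];
        for (int p : preds) counts[p]++;
        int best = 0;
        for (int c = 1; c < numClasses; c++)
            if (counts[c] > counts[best]) best = c;  // ties keep lower index
        return best;
    }
}
```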
Our classification results, along with several other results from the literature, are shown in Figure 4. We see that PSD features learned on individual octaves perform significantly better than those learned on entire frames.¹ Furthermore,
¹ In an effort to capture chords which might be split among two of the octaves, we also tried dividing the frequency range into 7 octaves, overlapping by half, and similarly learning features on each one. However, this did not yield an increase in accuracy.
Classifier    Features                         Acc. (%)
CSC           Many features [6]                92.7
SRC           Auditory cortical feat. [25]     92
RBF-SVM       Learned using DBN [12]           84.3
Linear SVM    Learned using PSD on octaves     83.4 ± 3.1
AdaBoost      Many features [2]                83
Linear SVM    Learned using PSD on frames      79.4 ± 2.8
SVM           Daubechies wavelets [19]         78.5
Log. Reg.     Spectral covariance [3]          77
LDA           MFCC + other [18]                71
Linear SVM    Auditory cortical feat. [25]     70
GMM           MFCC + other [29]                61
Figure 4. Genre recognition accuracy of various algorithms on the GTZAN dataset. Our results with standard deviations are marked in bold.
our approach outperforms many existing systems which use hand-crafted features. The two systems that significantly outperform our own rely on sophisticated classifiers based on sparse representations (SRC) or compressive sampling (CSC). The fact that our method is still able to reach competitive performance while using a simple classifier indicates that the features learned were able to capture useful properties of the audio that distinguish between genres. One possible interpretation is that some of the basis functions depicted in Figure 3 represent chords specific to certain genres. For example, perfect fifths (e.g. power chords) are very common in rock, blues and country, but rare in jazz, whereas quartal chords, which are common in jazz and classical, are seldom found in rock or blues.
4.3 Discussion
Our results show that automatic feature learning is a viable alternative to using hand-crafted features. Our approach performed better than most systems that pair signal processing feature extractors with standard classifiers such as SVMs, Nearest Neighbors or Gaussian Mixture Models. Another positive point is that our feature extraction system is very fast, and the use of a simple linear SVM makes this method viable on any size dataset. Furthermore, the fact that the features are learned in an unsupervised manner means that they are not limited to a particular task, and could be used for other MIR tasks such as chord recognition or autotagging.
We also found that features learned on octaves performed better than features learned on entire frames. This could be due to the fact that in the second case we are learning four times as many parameters as in the first, which could lead to overfitting. Another possibility is that features learned on octaves tend to capture relationships between fundamental notes, whereas features learned on entire frames also seem
to capture patterns between fundamentals and their harmonics, which could be less useful for distinguishing between genres.
One aspect that needs mentioning is that since we performed the unsupervised feature learning on the entire dataset (which includes the training and test sets without labels for each of the cross-validation folds), our system is technically akin to "transductive learning". Under this paradigm, test samples are known in advance, and the system is simply asked to produce labels for them. We subsequently conducted a single experiment in which features were learned on the training set only, and obtained an accuracy of 80%. Though less than our overall accuracy, this result is still within the range observed during the 10 different cross-validation experiments, which went from 77% to 87%. The seemingly large deviation in accuracy is likely due to the variation of class distributions between folds.
There are a number of directions in which we would like to extend this work. A first step would be to apply our system to different MIR tasks, such as autotagging. Furthermore, the small size of the GTZAN dataset does not exploit the system's ability to leverage large amounts of data in a tractable amount of time. For this, the Million Song Dataset [4] would be ideal.
A limitation of our system is that it ignores temporal dependencies between frames. A possible remedy would be to learn features on time-frequency patches instead. Preliminary experiments we conducted in this direction did not yield improved results, as many 'learned' basis functions resembled noise. This requires further investigation. We could also try training a second layer of feature extractors on top of the first, since a number of works have demonstrated that using multiple layers can improve classification performance [12, 15, 16].
5. CONCLUSION
In this paper, we have investigated the ability of PSD to automatically learn useful features from constant-Q spectrograms. We found that the features learned capture information about which chords are being played in a particular frame. Furthermore, these learned features can perform at least as well as hand-crafted features for the task of genre recognition. Finally, the system we proposed is fast and uses a simple linear classifier which scales well to large datasets.
In future work, we will apply this method to larger datasets, as well as a wider range of MIR tasks. We will also experiment with different ways of capturing temporal dependencies between frames. Finally, we will investigate using hierarchical systems of feature extractors to learn higher-level features.
6. REFERENCES
[1] A. Beck and M. Teboulle: "A Fast Iterative Shrinkage-Thresholding Algorithm with Application to Wavelet-Based Image Deblurring," ICASSP '09, pp. 693-696, 2009.
[2] J. Bergstra, N. Casagrande, D. Erhan, D. Eck and B. Kégl: "Aggregate features and AdaBoost for music classification," Machine Learning, 65(2-3):473-484, 2006.
[3] J. Bergstra, M. Mandel and D. Eck: "Scalable genre and tag prediction using spectral covariance," Proceedings of the 11th International Conference on Music Information Retrieval (ISMIR), 2010.
[4] T. Bertin-Mahieux, D. Ellis, B. Whitman and P. Lamere: "The million song dataset," Proceedings of the 12th International Conference on Music Information Retrieval (ISMIR), 2011.
[5] C.-C. Chang and C.-J. Lin: "LIBSVM: a library for support vector machines," software available at http://www.csie.ntu.edu.tw/cjlin/libsvm, 2001.
[6] K. Chang, J. Jang and C. Iliopoulos: "Music genre classification via compressive sampling," Proceedings of the 11th International Conference on Music Information Retrieval (ISMIR), pp. 387-392, 2010.
[7] S.S. Chen, D.L. Donoho and M.A. Saunders: "Atomic Decomposition by Basis Pursuit," SIAM Journal on Scientific Computing, 20(1):33-61, 1999.
[8] A. Courville, J. Bergstra and Y. Bengio: "A Spike and Slab Restricted Boltzmann Machine," Journal of Machine Learning Research, W&CP 15, 2011.
[9] I. Daubechies, M. Defrise and C. De Mol: "An Iterative Thresholding Algorithm for Linear Inverse Problems with a Sparsity Constraint," Communications on Pure and Applied Mathematics, 57:1413-1457, 2004.
[10] D.L. Donoho and M. Elad: "Optimally sparse representation in general (nonorthogonal) dictionaries via l1 minimization," Proceedings of the National Academy of Sciences, 100(5):2197-2202, 2003.
[11] K. Gregor and Y. LeCun: "Learning Fast Approximations of Sparse Coding," Proceedings of the 27th International Conference on Machine Learning, Haifa, Israel, 2010.
[12] P. Hamel and D. Eck: "Learning Features from Music Audio with Deep Belief Networks," Proceedings of the 11th International Conference on Music Information Retrieval (ISMIR), 2010.
[13] P. Herrera-Boyer, G. Peeters and S. Dubnov: "Automatic Classification of Musical Instrument Sounds," Journal of New Music Research, 32(1), March 2003.
[14] K. Kavukcuoglu, M.A. Ranzato and Y. LeCun: "Fast Inference in Sparse Coding Algorithms with Applications to Object Recognition," Computational and Biological Learning Laboratory, Technical Report CBLL-TR-2008-12-01, 2008.
[15] Y. LeCun, K. Kavukcuoglu and C. Farabet: "Convolutional Networks and Applications in Vision," Proceedings of the International Symposium on Circuits and Systems (ISCAS '10), IEEE, 2010.
[16] H. Lee, Y. Largman, P. Pham and A. Ng: "Unsupervised feature learning for audio classification using convolutional deep belief networks," Advances in Neural Information Processing Systems (NIPS) 22, 2009.
[17] H. Lee, A. Battle, R. Raina and A.Y. Ng: "Efficient Sparse Coding Algorithms," Advances in Neural Information Processing Systems (NIPS), 2006.
[18] T. Li and G. Tzanetakis: "Factors in automatic musical genre classification," IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, New Paltz, New York, 2003.
[19] T. Li, M. Ogihara and Q. Li: "A comparative study on content-based music genre classification," Proceedings of the 26th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR '03), 2003.
[20] Y. Li and S. Osher: "Coordinate descent optimization for l1 minimization with application to compressed sensing; a greedy algorithm," Inverse Problems and Imaging, 3(3):487-503, 2009.
[21] S. Mallat and Z. Zhang: "Matching Pursuits with time-frequency dictionaries," IEEE Transactions on Signal Processing, 41(12):3397-3415, 1993.
[22] P.-A. Manzagol, T. Bertin-Mahieux and D. Eck: "On the use of sparse time relative auditory codes for music," Proceedings of the 9th International Conference on Music Information Retrieval (ISMIR), 2008.
[23] M. Norouzi, M. Ranjbar and G. Mori: "Stacks of Convolutional Restricted Boltzmann Machines for Shift-Invariant Feature Learning," IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2009.
[24] B. Olshausen and D. Field: "Emergence of simple-cell receptive field properties by learning a sparse code for natural images," Nature, 1996.
[25] Y. Panagakis, C. Kotropoulos and G.R. Arce: "Music genre classification using locality preserving non-negative tensor factorization and sparse representations," Proceedings of the 10th International Conference on Music Information Retrieval (ISMIR), pp. 249-254, 2009.
[26] G. Peeters: "A large set of audio features for sound description (similarity and classification) in the CUIDADO project," Technical Report, IRCAM, 2004.
[27] C. Schörkhuber and A. Klapuri: "Constant-Q transform toolbox for music processing," 7th Sound and Music Computing Conference, Barcelona, Spain, 2010.
[28] E. Smith and M. Lewicki: "Efficient Auditory Coding," Nature, 2006.
[29] G. Tzanetakis and P. Cook: "Musical Genre Classification of Audio Signals," IEEE Transactions on Speech and Audio Processing, 10(5):293-302, 2002.
[30] J. Yang, K. Yu, Y. Gong and T. Huang: "Linear Spatial Pyramid Matching Using Sparse Coding for Image Classification," IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2009.
Seminar Nasional Teknologi Informasi dan Komunikasi 2016 (SENTIKA 2016) ISSN: 2089-9815
Yogyakarta, 18-19 March 2016
TEXT MINING PRE-PROCESSING OF TWITTER DATA
Siti Mujilahwati1
1Informatics Engineering Study Program, Faculty of Engineering, Universitas Islam Lamongan
Jl. Veteran No. 53 A, Lamongan
Phone: (0322) 324706
E-mail: moedjee@gmail.com
ABSTRACT
The rapid growth of social media has not led users to abandon Twitter. Twitter is a social medium that its users rely on to share information. Only a limited number of characters can be entered in a Twitter comment, and this very limitation has led researchers to use such data in their studies. Twitter comments contain many different data types and diverse language styles, so the comment data requires special handling. This study discusses pre-processing techniques for comment data from Twitter. To evaluate the resulting pre-processing techniques, they were tested on the classification of a telecommunications company's services, where an accuracy of 93.11% was achieved.
Keywords: Text Mining, Data Mining, Pre-processing, Twitter
1. INTRODUCTION
1.1. Background
Today's lifestyles lean heavily toward the online world: daily activities, whether working, doing business, studying, or socializing with friends, are inseparable from the internet. This has given rise to many so-called social media sites, one of which is Twitter. Twitter has grown rapidly and quickly gained popularity worldwide. As of January 2013, there were more than 500 million registered Twitter users, 200 million of whom were active. Growth in Twitter usage generally spikes during popular events. In early 2013, Twitter users sent more than 340 million comments (tweets) per day, and Twitter handled more than 1.6 billion search queries per day, with a monthly user growth rate of 40 percent. These figures have drawn researchers to exploit comment (tweet) data and apply mining techniques to it (Alexander, 2013), whether for analysis, classification, or association; as a discipline, this falls under text mining. Because Twitter comments contain many kinds of data, such as text, numbers, emoticons, hashtags, and mentions, they have a complex type (Apoorv et al., 2011). For this reason, extra handling is needed in the pre-processing (data preparation) stage. This study discusses several techniques for handling comment data from Twitter for the data mining process.
1.2. Research Method
The pre-processing stage prepares raw data before any further processing. In general, pre-processing eliminates unsuitable data or transforms the data into a form the system can process more easily. Pre-processing is very important in sentiment analysis, especially for social media, which largely consists of informal, unstructured words and sentences with a great deal of noise. There are three pre-processing models for sentences or text with heavy noise (A. Clark, 2003):
1. Orthographic Model. This model is used to correct words or sentences that are malformed. An example of an error corrected by the orthographic model is a capital letter in the middle of a word.
2. Error Model. This model is used to correct spelling and writing mistakes. Two kinds of errors are corrected with this model: writing errors, which refer to typing mistakes, and spelling errors, which arise when the writer does not know the correct spelling.
3. White Space Model. The third model concerns correcting punctuation. An example of an error for this model is a missing period '.' at the end of a sentence. This model is not very significant, however, especially when dealing with social media, where punctuation is rarely heeded.
The sequence of this research is to extract the data into data ready for the mining technique; this pre-processing stage can be called data extraction. The flow of the research is shown in Figure 1. First, data is collected from Twitter automatically and stored in a database. Depending on the goal of the mining technique to be applied, in this case classification, the raw data must first be labeled manually before pre-processing, to assign a class to each comment. Most importantly, the data collected consists of comments (tweets) on the desired topic. The data stored in the database then undergoes data extraction, and the extraction (pre-processing) results are tested on a classification case.
Following earlier work by Himalatha (Himalatha et al., 2012), this study discusses several data extraction processes: case folding, remove punctuation, remove username, remove hashtag, clean number, clean one character, remove URL, remove RT, convert number and remove number.
1. Case Folding. Converts all text to lowercase.
2. Remove Punctuation. Removes all non-alphabetic characters, such as symbols, extra whitespace and so on.
3. Remove Username. Removes user names, which usually begin with the symbol "@". In some cases they can be considered unimportant and therefore need to be removed; when they are needed, this step can be skipped.
4. Remove Hashtag. A hashtag is merely a marker, using the symbol "#", for a word being discussed among Twitter users. It is usually used as the title of a topic and also serves to group related conversations. This process too can be considered either important or unimportant, so the remove-hashtag step may or may not be applied.
5. Clean Number. Removes digits attached to the beginning or end of words. Commenters often add a digit at the start or end of a word to indicate repetition, but in proper Indonesian this is incorrect; likewise in a study, any word carrying such an extra digit should have it removed. For example, "hujan2" means "hujan-hujan" and "i2" means "itu".
6. Clean One Character. Removes tokens consisting of only a single letter, since they carry no meaning. Single letters appear frequently in Twitter comments and produce large, poor-quality extraction results. Examples are "y", "g" and "k"; although the commenter may intend "y" as "ya", "g" as "tidak" and "k" as "kok", for data extraction such tokens cannot easily be interpreted because they have no clear meaning.
7. Removal URL. URLs frequently appear in Twitter data, making the data ineffective and meaningless, so they need to be removed. These web addresses appear because many users promote a product on their own sites so that other users can go directly to the page in question.
8. Remove RT. On Twitter, one points at or invites a friend to communicate directly by adding the symbol "@" before the target user name. This research pays no attention to user names or to how many users comment; only the users' comment data is used, so these markers need to be removed.
9. Convert Number. Slang usage on Twitter often mixes digits into words as a stylistic variation, such as "s4y4n9". In proper Indonesian the word "s4y4n9" has no meaning, although what is meant is "sayang". A convert-number process is therefore needed to convert digits into letters.
Before converting numbers, the desired substitutions must be specified. This model has both advantages and disadvantages. If the research concerns, say, signal service or an operator's products, this extraction step may well be skipped, because it can change the meaning of a word in a comment: for a term like "3G", convert number would delete the digit 3, and the remaining letter G could then be altered by the next step, the convert word process. The number conversions whose data is used in this study can be represented as in Table 1.
Table 1. Digit-to-Letter Conversion
No   Digit   Letter
1    1       i
2    3       e
3    4       a
4    5       s
5    6, 9    g
6    7       t
7    8       b
The digit-to-letter conversion in this study uses only the data in Table 1: the digit 1 is replaced by the letter i, 3 by e, 4 by a and 5 by s; the digits 6 and 9 are replaced by g, 7 by t and 8 by b.
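A minimal sketch of this digit-to-letter substitution, using the mappings of Table 1 (the class name is ours, not from the application described later):

```java
public class ConvertNumber {
    // Digit-to-letter substitutions from Table 1:
    // 1->i, 3->e, 4->a, 5->s, 6/9->g, 7->t, 8->b.
    static String convert(String s) {
        return s.replace('1', 'i').replace('3', 'e').replace('4', 'a')
                .replace('5', 's').replace('6', 'g').replace('9', 'g')
                .replace('7', 't').replace('8', 'b');
    }
}
```

For example, the slang token "s4y4n9" from the text above becomes "sayang".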
10. Remove Stop Word. Stop words are words that occur frequently in a sentence and are considered unimportant, such as time words and conjunctions (Vijayarani), so they need to be removed. This removal process requires a list of the words to be deleted.
Table 2. Stop Word Data
Conjunctions   Time       Question words
dengan         senin      apa
di             selasa     bagaimana
karena         rabu       dimana
ke             kamis      kapan
is             jumat      mengapa
yang           sabtu      siapa
jika           minggu
bagi           januari
akan           februari
sebagai        maret
seperti        april
kalau          mei
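Stop-word removal as described above can be sketched as follows. Only a small illustrative subset of Table 2 is included here, and the class name is ours.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import java.util.StringJoiner;

public class RemoveStopWord {
    // A small subset of the stop-word list in Table 2.
    static final Set<String> STOP =
            new HashSet<>(Arrays.asList("dengan", "di", "yang", "ke"));

    // Drop any token found in the stop-word list.
    static String remove(String text) {
        StringJoiner out = new StringJoiner(" ");
        for (String w : text.split("\\s+"))
            if (!STOP.contains(w)) out.add(w);
        return out.toString();
    }
}
```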
11. Remove Negation Word. For negation words the process does not actually delete the word; instead, the word is taken as evidence that the sentence being processed contains negation. It is then added to a designated variable to be counted, for example in sentiment analysis cases that require scoring sentences as positive or negative. As with the stop-word removal function, the negation-word function uses a file path to a text file as storage for the collected data, as shown in Table 3.
Table 3. Negation Word List
No   Word
1    gak
2    ga
3    bkn
4    bukan
5    enggak
6    g
7    jangan
8    nggak
9    tak
10   tdk
11   tidak
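A sketch of this negation-word step, which counts rather than deletes negations so that a later sentiment step can flag the sentence (the word list follows Table 3; the class name is ours):

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

public class NegationCheck {
    // Negation words from Table 3.
    static final Set<String> NEG = new HashSet<>(Arrays.asList(
            "gak", "ga", "bkn", "bukan", "enggak", "g", "jangan",
            "nggak", "tak", "tdk", "tidak"));

    // Count negation tokens instead of removing them.
    static int countNegations(String text) {
        int n = 0;
        for (String w : text.toLowerCase().split("\\s+"))
            if (NEG.contains(w)) n++;
        return n;
    }
}
```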
12. Convert Word. Convert word is important for converting non-standard words; nowadays the use of 'alay' or slang language leads to non-standard Indonesian.
Table 4. Example Word List for Convert Word
Before      After
akyu        aku
akuwh       aku
akku        aku
aq          aku
aquwh       aku
awak        aku
amaca       ahmasak
alluw       hallo
atw         atau
bb          blackberry
bwt         buat
bs          bisa
bsa         bisa
bli         beli
binun       bingung
btw         ngomong-ngomong
bnerin      benerin
bapuk       jelek
bnr         benar
cemungud    semangat
ciyus       serius
cuxin       cuekin
coz         sebab
cz          karena
cay         sayang
cayank      sayang
dmn         dimana
ett         add
enelan      beneran
engga       enggak
eank        yang
fren        teman
gantii      ganti
gantiii     ganti
gnt         ganti
gmn         gimana
gni         gini
grtis       gratis
gituu       begitu
hhumz       rumah
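Dictionary-based word normalization as described above can be sketched as follows. Only a few entries of Table 4 are shown, and the class name is ours.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.StringJoiner;

public class ConvertWord {
    // A small subset of the slang-to-standard lookup table (Table 4).
    static final Map<String, String> DICT = new HashMap<>();
    static {
        DICT.put("aq", "aku");
        DICT.put("bwt", "buat");
        DICT.put("gmn", "gimana");
    }

    // Replace each known slang token with its standard form;
    // unknown tokens are kept unchanged.
    static String convert(String text) {
        StringJoiner out = new StringJoiner(" ");
        for (String w : text.split("\\s+"))
            out.add(DICT.getOrDefault(w, w));
        return out.toString();
    }
}
```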
13. Convert Emoticon. Expressions are often conveyed with an emoticon image or symbol on Twitter, so they need to be converted into a string whose meaning can be interpreted. The function that converts emoticon symbols is almost the same as the convert negation and convert word functions; only the contents of the data collection used differ. This study uses three classes of emoticon expression (Read, 2005; Go et al., 2009), as listed in Table 5.
Table 5. Emoticon List
Emoticon                                           Conversion           Class
>:] :-) :) :o) :] :3 :c) :> =] 8) =) :} :^)        Senang (happy)       Positive
>:D :-D :D 8-D 8D x-D xD XD =-D =D =-3 =3          Tertawa (laughing)   Positive
>:[ :-( :( :-c :c :-< :< :-[ :[ :{ :'(             Sedih (sad)          Negative
D-':                                               Horror               Neutral
>:P :-P :P X-P x-p xp XP :-p :p =p :-Þ :Þ :-b :b   Tongue               Neutral
>:o >:O :-O :O °o° °O° o_O o.O 8-0                 Shock                Positive
>:\ >:/ :-/ :-. :/ :\ =/ =\ :S                     Kesal (annoyed)      Negative
:| :-|                                             Ekspresi datar (flat) Negative
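Emoticon conversion can be sketched as follows. Only a small subset of Table 5 is shown, and longer emoticons are substituted first so that, for example, ":-(" is not partially matched; the class name is ours.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class ConvertEmoticon {
    // A small subset of Table 5; insertion order matters, so longer
    // emoticons come before their shorter prefixes/suffixes.
    static final Map<String, String> EMO = new LinkedHashMap<>();
    static {
        EMO.put(":-)", "senang");
        EMO.put(":)", "senang");
        EMO.put(":-(", "sedih");
        EMO.put(":(", "sedih");
    }

    // Replace each emoticon with its conversion label.
    static String convert(String text) {
        for (Map.Entry<String, String> e : EMO.entrySet())
            text = text.replace(e.getKey(), e.getValue());
        return text;
    }
}
```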
2. DISCUSSION
Carrying out the pre-processing stage is considered very important in data mining techniques, especially for data sourced from social media in the form of text. To assess the results of this research, the following points are discussed: pre-processing and a test on a classification case.
2.1 Pre-Processing (Data Extraction)
This pre-processing application was built using the Java programming language. This section discusses some of the code used in this study.
1. Case Folding

public String foldCase(String myString) {
    return myString.toLowerCase();
}

2. Remove Punctuation

public String removePunctuation(String myString) {
    String myPattern = "[^A-Za-z0-9\\s]+";
    String newString = myString.replaceAll(myPattern, "");
    return newString;
}
3. Remove User Name
Seminar Nasional Teknologi Informasi dan Komunikasi 2016 (SENTIKA 2016) ISSN: 2089-9815
Yogyakarta, 18-19 Maret 2016
public String removeUsers(String myString) {
    Extractor myExtractor = new Extractor();
    List<String> myUsers = myExtractor.extractMentionedScreennames(myString);
    String myResult = this.removeWords(myUsers, myString, "@");
    return myResult;
}
4. Remove Hashtag

public String removeHashtags(String myString) {
    Extractor myExtractor = new Extractor();
    List<String> myHashtags = myExtractor.extractHashtags(myString);
    String myResult = this.removeWords(myHashtags, myString, "#");
    return myResult;
}
5. Clean Number

public String cleanNumber(String myString) {
    String myPatternEnd = "([0-9]+)(\\s|$)";
    String myPatternBegin = "(^|\\s)([0-9]+)";
    myString = myString.replaceAll(myPatternBegin, "$1");
    myString = myString.replaceAll(myPatternEnd, "$2");
    return myString;
}
6. Clean One Character

public String removeSingleChar(String myString) {
    String newString = "";
    String[] listWords = myString.split(" ");
    for (String myWord : listWords) {
        if (myWord.length() > 1) {
            if (newString.length() != 0) {
                newString += " " + myWord;
            } else {
                newString = myWord;
            }
        }
    }
    return newString;
}
7. Remove URL

public String removeURLs(String myString) {
    Extractor myExtractor = new Extractor();
    List<String> myURLs = myExtractor.extractURLs(myString);
    String myResult = this.removeWords(myURLs, myString);
    return myResult;
}
8. Remove RT

public String removeRT(String myString) {
    String myPattern = "(\\s)(RT)(\\s)";
    String newString = myString.replaceAll(myPattern, "$1");
    return newString;
}
9. Convert Number

public String convertNumber(String myString) {
    myString = myString.replace("00", "u");
    Iterator myIterator = numberMap.entrySet().iterator();
    while (myIterator.hasNext()) {
        Map.Entry myPair = (Map.Entry) myIterator.next();
        String myKey = (String) myPair.getKey();
        String myValue = (String) myPair.getValue();
        myString = myString.replaceAll(myKey, myValue);
    }
    return myString;
}
10. Remove Stopword

public String removeStopWords(String myString) {
    for (String myStopWord : this.stopWordsList) {
        String myPattern = "(^|\\s)(" + myStopWord + ")(\\s|$)";
        myString = myString.replaceAll(myPattern, " ");
    }
    return myString;
}
11. Remove Negation Word

public String removeNegationWords(String myString) {
    for (String myNegationWord : this.negationWordsList) {
        String myPattern = "(^|\\s)(" + myNegationWord + ")(\\s|$)";
        myString = myString.replaceAll(myPattern, " ");
    }
    return myString;
}
12. Convert Word

public String convertWords(String myString) {
    Iterator myIterator = this.convertWordMap.entrySet().iterator();
    while (myIterator.hasNext()) {
        Map.Entry myPair = (Map.Entry) myIterator.next();
        String myKey = (String) myPair.getKey();
        String myValue = (String) myPair.getValue();
        myString = this.convertWord(myString, myKey, myValue);
    }
    return myString;
}
13. Convert Emoticon

public String convertEmoticons(String myString) {
    Iterator myIterator = this.emoticonsMap.entrySet().iterator();
    while (myIterator.hasNext()) {
        Map.Entry myPair = (Map.Entry) myIterator.next();
        String myKey = (String) myPair.getKey();
        String myValue = (String) myPair.getValue();
        myString = this.convertWord(myString, Pattern.quote(myKey), myValue);
    }
    return myString;
}
The implementation of the user-interface design is shown in Figure 2.

Figure 2. Implementation of the Twitter Data Pre-Processing Code in Java
Figure 2 shows that each processing step can be selected according to need; the interface therefore uses a checklist design, so that users can choose which processes suit their research topic. The stopword, negation, convert-word and convert-emoticon lists use a load model: each list is stored as a text file, so a user who wants to add or remove entries can edit the file directly. The output of the pre-processing system is the resulting data, referred to as the training data. In addition, the system keeps a log documenting every operation performed during pre-processing. Figure 3 gives an example of the log produced by the case folding process.
Figure 3. Log Output of the Case Folding Process
Figure 3 shows that each log entry consists of two lines: the first line is the original text and the second is the case-folded result. The log also records the date on which the process was run. Each step in the pre-processing stage has its own log in the form of a text file.
The test results for all steps of the pre-processing stage are summarized in Table 6.

Table 6. Results of the Pre-Processing Stage
Process | Original text | Extraction result
Case folding | SurabayaPOS | Surabayapos
Remove punctuation | Ka:bar*ba!k | Kabarbak
Remove username | @sandiwahono | (removed)
Remove hashtag | #makanan | (removed)
Clean number | Teknik12 | Teknik
Clean one character | G makan | Makan
Remove URL | http://stts.edu | (removed)
Remove RT | iya mel RT @moedjee | Iya mel @moedjee
Convert numbers | M4m4 | Mama
Remove number | Hanya 5 jam | Hanya jam
Remove stopword | Mau ke kampus | Mau kampus
Remove negation word | Tidak tahu | Tahu
Convert word | Aq lagi tlp | Aku lagi telepon
Convert emoticon | O | Pos
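Chained together, the steps illustrated in Table 6 form a single pipeline. The sketch below is a hypothetical, self-contained approximation using only the regex-based steps; the simple username and hashtag patterns are stand-ins for the twitter-text Extractor used in the paper, and the step order is chosen so that punctuation removal does not destroy the "@" and "#" markers first.

```java
public class PreprocessPipeline {
    // Hypothetical simplified pipeline chaining the regex-based steps.
    public static String preprocess(String text) {
        text = text.replaceAll("(\\s)(RT)(\\s)", "$1");        // remove RT
        text = text.replaceAll("(^|\\s)@\\w+", "$1");          // remove usernames (simplified)
        text = text.replaceAll("(^|\\s)#\\w+", "$1");          // remove hashtags (simplified)
        text = text.toLowerCase();                             // case folding
        text = text.replaceAll("[^A-Za-z0-9\\s]+", "");        // remove punctuation
        text = text.replaceAll("(^|\\s)[0-9]+(\\s|$)", "$1");  // clean numbers
        return text.trim().replaceAll("\\s+", " ");            // normalize whitespace
    }

    public static void main(String[] args) {
        System.out.println(preprocess("iya mel RT @moedjee, hanya 5 jam #makanan"));
        // prints "iya mel hanya jam"
    }
}
```

Ordering matters throughout the pipeline: for instance, if punctuation were stripped first, the "@" and "#" markers would be lost before the username and hashtag steps could act on them.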
2.2 Testing the Pre-Processing Results on a Classification Case
The dataset produced by the pre-processing stage was then tested in a classification task for the product services of a telecommunications company, using 680 Twitter records as training data and 450 records as test data. Examples of the Twitter comments used are shown in Table 7 (Mujilahwati, 2015).

Table 7. Example Comment Data from Twitter
No | Comment
1 | serasa mati tanpa internet dalam beberapa hari... :'( gara gara indosat @indosat fuck*
2 | Hadeuhhh...bebenya udh benerr .skrg providernya yg error @indosat#mentari
3 | SinyaL indosat gila' naik turun trus @indosat
4 | @XLCare sinyal bb cm GPRS doang nih,,, ada apa ya ?
5 | 2tahun pake simpati bb gue sinyal 3G terus tapi belakangan kenapa sekarang EDGE mulu gan @Telkomsel -_
6 | Sapu-sapu dada saja sama jaringannya telkomsel ini O
7 | Mending INDOSAT ayeuna mah beli paket 25ribu dapat 2GB cobaTelkomsel 100ribu lebih pelit
Using the pre-processing techniques developed here, these comments are transformed into the dataset shown in Table 8.

Table 8. Extraction (Pre-Processing) Results
No | Comment
1 | Serasa mati tanpa internet dalam beberapa hari....sedih gara gara indosat indosat fuck
2 | Hadeuhhh...bebenya udh benerr..skrg providernya yang error indosat mentari
3 | Sinyal indosat gila naik turun trus indosat
4 | Xlcare sinyal bb cm gprs doing nih....ada apa ya?
5 | Tahun pake simpati sinyal terus belakangan kenapa sekarang edge mulu gan telkomsel
6 | Sapu sapu dada saja sama jaringannya telkomsel ini (ekspresi datar)
7 | Mending indosat ayeuna mah beli paket ribu dapat gb coba telkomsel ribu lebih pelit
The data used in this experiment can be summarized in the matrix shown in Table 9.

Table 9. Training and Test Data

No | Class | Training records | Test records
1 | Sinyal | 188 | 158
2 | Tarif | 112 | 45
3 | Internet | 196 | 130
4 | Android | 22 | 10
5 | Blackberry | 97 | 47
6 | Other | 65 | 60
The algorithm used in this experiment is Naïve Bayes. The classification results obtained on these data are shown in the matrix of Table 10.

Table 10. Classification Test Results (rows: actual class; columns: predicted class)

Actual \ Predicted | Sinyal | Tarif | Internet | Android | Blackberry | Other
Sinyal | 150 | 1 | 2 | 1 | 0 | 4
Tarif | 0 | 43 | 2 | 0 | 0 | 0
Internet | 5 | 1 | 124 | 0 | 0 | 0
Android | 0 | 0 | 0 | 10 | 0 | 0
Blackberry | 0 | 2 | 0 | 0 | 45 | 0
Other | 3 | 4 | 1 | 2 | 3 | 47
The per-category accuracy obtained from the classification results for the service classes is as follows.
1. Sinyal class = 94.93%.
2. Tarif class = 95.55%.
3. Internet class = 95.38%.
4. Android class = 100%.
5. Blackberry class = 100%.
6. Other class = 78.33%.
Of the 450 test records, each service category achieved a very good accuracy percentage. Over all service classes, using training data that had passed through the pre-processing stage, the Naïve Bayes algorithm correctly classified 419 of the 450 test records, giving an overall accuracy of 93.11% for service-category classification.
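The reported figures follow from the confusion matrix in Table 10: each per-class accuracy is the diagonal entry divided by the row total, and the overall accuracy is the diagonal sum (419) over all 450 test records. A minimal sketch of this calculation follows; the class labels and counts are copied from Tables 9 and 10, and the printed per-class values are recomputed directly from the matrix, so rounding may differ slightly from the figures listed above.

```java
public class AccuracyFromConfusion {
    public static void main(String[] args) {
        // Confusion matrix from Table 10; rows = actual class, columns = predicted class.
        String[] labels = {"Sinyal", "Tarif", "Internet", "Android", "Blackberry", "Other"};
        int[][] m = {
            {150, 1, 2, 1, 0, 4},
            {0, 43, 2, 0, 0, 0},
            {5, 1, 124, 0, 0, 0},
            {0, 0, 0, 10, 0, 0},
            {0, 2, 0, 0, 45, 0},
            {3, 4, 1, 2, 3, 47}
        };
        int correct = 0, total = 0;
        for (int i = 0; i < m.length; i++) {
            int rowTotal = 0;
            for (int v : m[i]) rowTotal += v;
            // Per-class accuracy: diagonal entry over the number of test records of the class.
            System.out.printf("%s = %.2f%%%n", labels[i], 100.0 * m[i][i] / rowTotal);
            correct += m[i][i];
            total += rowTotal;
        }
        // Overall accuracy: 419 correct of 450 records, i.e. 93.11%.
        System.out.printf("Overall = %.2f%%%n", 100.0 * correct / total);
    }
}
```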
3. CONCLUSION
Every step of the Twitter text pre-processing stage developed here performs very well, so the approach can be used in further data mining and text mining research.
This study did not address stemming, and yet the pre-processing results used in the classification experiment achieved good results, reaching an accuracy of 93.11%. The classification results are still considered suboptimal because the data contain mentions of customer-support accounts, which are lost when usernames and hashtags are removed.
Although the results obtained without stemming are already very good, a stemming step could be added to produce a more efficient and more compact dataset, which would also make the subsequent data mining process faster.
4. REFERENCES
Agarwal, A., Xie, B., Vovsha, I., Rambow, O., Passonneau, R. (2011). Sentiment Analysis of Twitter Data. Department of Computer Science, Columbia University, New York, USA.
Clark, A. (2003). Pre-processing Very Noisy Text. In Proceedings of the Workshop on Shallow Processing of Large Corpora (pp. 12-22). Lancaster: Lancaster University.
Hemalatha, I., Saradhi Varma, G. P., Govardhan, A. (2012). Preprocessing the Informal Text for Efficient Sentiment Analysis. International Journal of Emerging Trends & Technology in Computer Science (IJETTCS), 1(2), July-August 2012.
Read, J. (2005). Using Emoticons to Reduce Dependency in Machine Learning Techniques for Sentiment Classification. In ACL. The Association for Computational Linguistics.
Mujilahwati, S. (2015). Klasifikasi dan Sentiment Analysis dari Twitter untuk Komentar pada Penyedia Jasa Seluler di Indonesia. In Proceedings of the Seminar Nasional Pengembangan Aktual Teknologi Informasi, UPN "Veteran", Surabaya, Jawa Timur, 2 December.
Pak, A., Paroubek, P. (2010). Twitter as a Corpus for Sentiment Analysis and Opinion Mining. Laboratoire LIMSI-CNRS, Bâtiment 508, Université de Paris-Sud, France.
Vijayarani, S., Ilamathi, J., Nithya, M. Preprocessing Techniques for Text Mining - An Overview. International Journal of Computer Science & Communication Networks, 5(1), 7-16.
Proceedings of the 2016 International Conference “ECONOMIC SCIENCE FOR RURAL DEVELOPMENT” No 43
Jelgava, LLU ESAF, 21-22 April 2016, pp. 39-39
ECONOMIC PROBLEMS OF REAL ESTATE MARKET AND ITS INFLUENCE
ON THE DEVELOPMENT OF BUSINESS ENVIRONMENT
Linda Kauskale1, Mg.oec, PhD student, Ineta Geipele2, prof., Dr. oec.
1, 2Institute of Civil Engineering and Real Estate Economics, Faculty of Engineering Economics and
Management, Riga Technical University
The topicality of the research is determined by the fact that economic problems affect the development of the business environment and society in general. The aim of the research is to analyse key economic problems, paying special attention to business cycles and their importance for the development of the business environment and entrepreneurship in the real estate market, which is also an object of the research. Economic problems in general, and problems of real estate market development in particular, greatly influence the development of society, and all these aspects are in turn highly important for the development of the business environment. The tasks of the research are the following: to conduct a detailed analysis of the scientific literature on economic development problems; to determine the key influences on real estate market development and the development of the business environment; to analyse the related social aspects of the researched question; and to summarize the results of the research and develop recommendations for the improvement of the business environment.
The hypothesis of the research is that the resolution of economic problems in the existing economic system is crucial for the development of the business environment and has an impact on entrepreneurship in the real estate market. The paper employs a detailed analysis of the scientific literature, research articles and international experience on the research problem. Induction, deduction, logical access and historical approach methods have been used in order to achieve the aim of the research. The integrated approach of the conducted research allows a wider analysis of the research question, emphasizes the novelty of the research and is important for the development of recommendations. One of the priorities of the Europe 2020 strategy is sustainable and inclusive economic development, so special attention should be paid to its improvement and to the resolution of the related problems (European Commission, 2015).
1. Socio-economic development problems and business cycle
The development of society can be observed every day, and the areas and problems that should be improved and resolved can likewise be identified on a daily basis. The development history of society includes several significant stages, each with its pros and cons. The most significant directions of human development over the past 500 years are presented in Table 1.
Corresponding author. Tel.: + 371 29963349; fax: + 371 67089034 E-mail address: Linda.Kauskale@rtu.lv 39
Table 1
The most significant directions of human development over the past 500 years

Year (approx.) | Direction of the Development | Typical Features of the Development
1600 | Scientific revolution | Use of modern science and resources
1750 | Enlightenment | Development based on science and intellectual awareness
1775 | Industrial revolution, urbanization | Mass production of various goods; extensive use of fossil fuels
1900 | Socialism, democracy | Stronger civil society: development opportunities for each
1920 | Marxism and capitalism (in the East and in the West) | New economic order
1950 | Modernism | State and market, economic growth
1980 | Neoliberal capitalism | Margaret Thatcher, Ronald Reagan, globalization
2000 | Search for a new direction of development | Sustainable development
Source: Klavins and Zaloksnis, 2010, p.18
The development of society is cyclical. There are several types of cycles, differing in a number of factors, such as duration and intensity; the causes and consequences of the cycles also differ. The following cycles are particularly important for the analysis of economic problems:
• economic cycles, which are also called business cycles or trade cycles;
• cycles of financial and capital markets;
• real estate market cycles etc.
Arthur F. Burns and Wesley C. Mitchell (1946) define business cycles as "a type of fluctuation found in the aggregate economic activity of nations that organize their work mainly in business enterprises: a cycle consists of expansions occurring at about the same time in many economic activities, followed by similarly general recessions, contractions, and revivals which merge into the expansion phase of the next cycle; this sequence of changes is recurrent but not periodic; in duration business cycles vary from more than one year to ten or twelve years; they are not divisible into shorter cycles of similar character with amplitudes approximating their own". However, Burns and Mitchell (1946) also criticized their own definition, because it raised the following question: "Does a large nation, such as the United States, have a single set of business cycles, or do the several geographical regions have substantially different cyclical movements?"
David Begg (2013) defines the business cycle as "a short-term fluctuation of output around its trend path". He considers cycles likely once the limits imposed by supply and demand are recognized. Christina D. Romer (2008) points out that the term "business cycle" is misleading, because "cycle" suggests some regularity in the timing and duration of expansions and recessions in economic activity, whereas most economists have another understanding: there were three recessions in the USA between 1973 and 1982, yet 1982 was followed by eight years of uninterrupted expansion. Many modern economists therefore prefer the term "short-run economic fluctuations" to "business cycle" when describing changes in economic activity.
Irving Fisher (1933) emphasized that co-existing cycles could also aggravate or neutralize each other. Ludwig von Mises (1949) stated that "the wavelike movement affecting the economic system, the recurrence of periods of boom which are followed by periods of depression, is the unavoidable outcome of the attempts, repeated again and again, to lower the gross market rate of interest by means of credit expansion". Ludwig von Mises (1949) also specified that one was free to demand an improvement in the quality and an increase in the quantity of products, but he noted that if this formulation were applied to the various phases of the cyclical fluctuations of business activity, the boom should be called retrogression and the depression should be called progress, because the boom produced impoverishment and moral ravages, the more so the more optimistic people were under the illusory prosperity of the boom.
These ideas also have a great influence on the real estate market, because the business cycle affects the real estate market and, in some cases, vice versa. Investment Property Databank (1999) notes that "property cycles are recurrent but irregular fluctuations in the rate of all-property total return, which are also apparent in many other indicators of property activity but with varying leads and lags against the all-property cycle". Common features of the three UK property crashes of 1973-4, 1989-91 and 2007-9 are (Jowsey, 2011):
• over-optimistic developments supported by borrowed funds;
• lenders borrowed short and lent long without adequate security;
• lenders were exposed to unforeseen shocks:
a. oil price rise in 1973;
b. inflation and subsequent interest rate rise in 1989;
c. US sub-prime crisis in 2007.
The government plays a key role in the real estate market regulation. Kate Barker (2004) in The Barker Review for UK defined the following objectives:
• to achieve improvements in housing affordability in the market sector;
• a more stable housing market;
• location of housing supply which supports patterns of economic development;
• an adequate supply of publicly funded housing for those who need it.
Milton Friedman (1962) in his work Capitalism and Freedom emphasized that many government programmes did not come into effect until the recession had passed; more often they affected total expenditures and tended to exacerbate the succeeding expansion rather than mitigate the recession. He mentioned that fiscal and monetary policy often lacked the information needed to use changes in taxation or expenditures as a sensitive stabilizing mechanism. Macroprudential policy is also very important nowadays; its aim is to increase the resilience of the financial system to shocks and to moderate the financial cycle (Committee on the Global Financial System, 2010). Anthony Giddens (1984) believes that people in social life strive for a certain stability, i.e. they have a need for security and confidence. An economic indicator such as income distribution is also very important for the development of society. Simon Kuznets (1955) in his work Economic Growth and Income Inequality argued that economic growth in poor countries increased the income difference between rich and poor people. Simon Kuznets (1930) also analysed the cyclical nature of production outcomes and prices over periods of fifteen to twenty years; such trade cycles are often referred to as "Kuznets cycles". Vilfredo Pareto (1961) was famous for his elite theory: he claimed that people are not equal physically, intellectually or morally, and therefore believed that inequality was completely natural, obvious and real. This meant that people with higher indicators composed the elite, and each sphere of human activity had its own specific elite. According to Vilfredo Pareto (1961), the elite was divided into governmental and non-governmental political elites, and he stressed that an exchange between the elite and the rest of society was continuously taking place, such circulation being necessary to ensure social balance.
Human inequality is related to income inequality, which is one of the most important factors defining the accessibility of housing for different social groups. The countries with the least affordable housing are presented in Figure 1.
Source: Demographia, 2015, p.13
Fig.1. National Housing Affordability in 2014 according to data from 378 markets
(Median Multiple - Median House Price Divided by Median Household Income)
Housing affordability is a socio-economic aspect that affects the quality of life of society and also has high economic significance. John Kenneth Galbraith (1958) was convinced that economic imbalances also resulted from the consumer society, because too many resources were channelled into the production of goods while insufficient resources were directed to public needs and infrastructure. There are several problems both in the economy and in the real estate market, and development fluctuations make the business environment uncertain, so it is necessary to understand how entrepreneurs and market participants can reduce the impact of the negative factors.
2. Business environment and real estate market development problems
The main areas where increased levels of entrepreneurial activity can contribute significantly to specific policy outcomes are (OECD, 2004):
1) job creation;
2) economic growth, productivity improvement, and innovation;
3) poverty alleviation and social opportunities.
It is important to note that the company is affected by a number of micro and macro factors. The competitiveness-oriented management micro and macro factors are presented in Figure 2.
[Figure: the macro-environment is described by PESTEL factors (Political, Economic, Social/cultural/demographic, Technological, Ecological, Legal), linked through the competitiveness-oriented management to the micro-environment (enterprise, product, consumers, mediators, competitors et al.).]
Source: Fedotova, 2012. Figure updated by the authors
Fig.2. The competitiveness-oriented management micro- and macro-environment
All influencing micro- and macro-environmental factors have a significant impact on the business environment in general, as well as on the operation and competitiveness of any company. Nowadays, external factors are becoming more and more important for strategic management and the development of the business environment. Previous research (Kauskale and Geipele, 2015) has shown that, for example, in Latvia in the period from 2004 to 2008, the number of mortgages secured in the Land Registry in Riga was larger than the number of registered purchase agreements, which was one of the signals of market overheating. After the crisis period the situation changed, but this fact highlights the significance of the mortgage system in the country and the availability of credit for real estate purchases on the local market. At the same time, the Competing Values Framework highlights the importance of the influence of external and internal factors on entrepreneurship. The influence of macroeconomic factors is especially important for the open systems model, although it affects all entrepreneurs. The Competing Values Framework of Quinn is shown in Figure 3.
Source: Quinn, 1988
Fig.3. Competing values framework of Quinn
One study reveals that environmental competitiveness positively moderates the link between business performance and process innovation but negatively moderates the link between business performance and product innovation (Prajogo, 2016). The business environment in a country can also influence the volume of attracted foreign direct investment. For example, the taxation of FDI is quantitatively less relevant than the impact of other policies that make a location attractive to international investors, such as openness, regulatory hurdles and labour costs (Hajkova, Nicoletti, Vartia and Kwang-Yeol, 2006). OECD (2005) defined four criteria of the business environment:
• access to skills;
• access to capital;
• access to opportunities;
• influencing the risk-reward trade-off.
The correlation between entrepreneurial performance and entrepreneurial education is one of the highest in OECD member countries (OECD, 2005). Formal and non-formal education of citizens is very important for the development of each region. Formal and non-formal education of the citizens of Latvia is presented in Figure 4. It is a positive tendency that the number of people who received education increased in most Latvian regions in the period from 2004 to 2011.
Source: authors’ construction based on the Central Statistical Bureau of Latvia
Fig.4. Formal and non-formal education in Latvia, thou. persons
El-hadj Bah and Lei Fang (2015) analysed the business environment in Sub-Saharan Africa using the following criteria: corruption, crime, infrastructure, regulation and access-to-finance indicators. The regulatory environment of an economy may be more business-friendly in some countries than in others, for instance in such areas as starting a business, ease of doing business, dealing with construction permits, getting electricity, registering property, getting credit, protecting minority investors, paying taxes, trading across borders, enforcing contracts and resolving insolvency (Doing Business, 2015b). The ease of doing business in general also shows the level of attractiveness of the business environment. The country ranking of Doing Business (2015a) is shown in Table 2.
Table 2
Ranking of countries*

Economy | Ease of Doing Business Rank | Starting a Business | Dealing with Construction Permits
Singapore | 1 | 10 | 1
New Zealand | 2 | 1 | 3
Denmark | 3 | 29 | 5
Korea, Rep. | 4 | 23 | 28
Hong Kong SAR, China | 5 | 4 | 7
United Kingdom | 6 | 17 | 23
United States | 7 | 49 | 33
Sweden | 8 | 16 | 19
Norway | 9 | 24 | 26
Finland | 10 | 33 | 27
* 10 countries were chosen in accordance with the Doing Business ranking
Source: authors' construction based on the Doing Business rankings, 2015
Each country has its own competitive advantages as well as stronger and weaker points. External factors and global challenges affect entrepreneurship in general, as well as the construction business and the development of the real estate market. For example, the State Regional Development Agency (2015) focuses attention on the economic, social and environmental spheres, which, in the opinion of the authors, influence the development of the business environment.
Table 3
Business environment influencing factors

Economic sphere:
- Investments (private non-financial)
- Economic activity (economic performance of the company, number of companies, establishment, liquidation)
- Infrastructure (physical indicators)
- Workplaces
- Municipal budget revenues

Social sphere:
- Demography
- Education, qualification, creative potential
- Employment, unemployment
- Health (incl. social) of the inhabitants
- Security and public order
- Social security
- Income, consumption, housing sector
- Expenses of municipalities for education, social and cultural activities

Environmental sphere:
- Territory, its structure
- Resources
- Pollution, environmental quality
- Water supply, sewerage and waste management (physical indicators)

Source: authors' construction based on the State Regional Development Agency, 2015
For economic growth calculations, there are attempts to analyse characteristics such as economic diversity, international trade, real income of the population, the level of tax burden, the volume of savings, and economic infrastructure. In the group of indicators of social development, one can find social indicators reflecting income conditions of the population, level of poverty, income differentiation, employment and unemployment, minimum wage, extent of housework and volunteer activities, life expectancy, infant mortality, health status of the population, incidence of obesity, extent of crime, abuse and other factors (Zuzana Hajduova et al., 2014). There are many problems to be resolved in the development of the national economy and of the real estate market. The main problems are grouped and shown in Table 4. Resolution of these problems can positively influence the development of the business environment.
Social and economic problems influence the real estate market. The interconnection between the national economy, the real estate market and the construction industry is shown in Figure 5.
The economy, industry development and the competitiveness of enterprises are interrelated (Denisovs and Judrupa, 2008). At the same time, economic growth affects the real estate market, and vice versa. The interconnection between government policies, the national economy and the real estate market is shown in Figure 6.
Purchasing power affects demand in the real estate market, while a recession in the real estate market can negatively affect GDP. The volume of foreign direct real estate investment can increase GDP. The most important difficulty is the lack of information, which affects the decision-making process of market participants. Globally there is a great deal of information, but the effect of external factors is crucial. The following factors may have an impact on forecasts: political changes, sociological and cultural changes, economic influences, climate change, and technological issues (Harris, McCaffer and Edum-Fotwe, 2013). The structure of the national economy and the real estate market is complicated and affects the business environment as well as such aspects of entrepreneurship as company management, so all the mentioned aspects are interconnected.
Corresponding author. Tel.: + 371 29963349; fax: + 371 67089034 E-mail address: Linda.Kauskale@rtu.lv 45
Proceedings of the 2016 International Conference “ECONOMIC SCIENCE FOR RURAL DEVELOPMENT” No 43
Jelgava, LLU ESAF, 21-22 April 2016, pp. 39-46
Table 4
National economy development, real estate market and construction problems
National economy development problems:
- Cyclical nature of development, need to achieve sustainable economic development
- Development and price stability are required, but not always possible to achieve
- Economic growth in terms of GDP growth (which results in an increase in the number of produced goods and services)
- Employment support, creation and stimulation of new jobs, labour market development, achievement of full employment
- Unequal income distribution and the need to achieve effective use of resources
- Optimal fiscal, monetary and macroprudential policy development and implementation
- Successful regional policy
- Impact of international factors
- Net export, more value-added business, positive trade balance

Real estate market problems:
- Cyclical nature of real estate market development: imbalance, economic development fluctuations, demand and supply shocks, difficulties in forecasting
- Sensitivity of the real estate market and its participants to external factors and economic development
- Real estate market overheating
- Real estate market recession, stagnation
- Financial market development, availability of mortgage loans
- Insufficient information
- Real estate return rates
- Real estate market impact on the national economy
- Behaviour of real estate market participants, short-term thinking
- Insufficient demand
- Housing affordability problems, affected by: purchasing power and salaries; availability of mortgages; real estate purchase prices; real estate rental rates; other PESTEL factors

Construction problems:
- Significant impact of external factors: economic development and real estate market
- Investment returns
- Availability of financing
- Construction costs
- Environmental aspects, such as environmental protection and ecological construction
- Construction quality
- Compliance with deadlines
- Company management problems
- Legal and other PESTEL factors
Source: authors’ construction
[Figure content: national economy -> business environment -> construction and real estate entrepreneurship]
Source: authors’ construction
Fig. 5. Interconnection between the national economy, business environment, construction and real estate entrepreneurship

Source: authors’ construction
Fig. 6. Interconnection between government policies, national economy development and the real estate market
Conclusions, proposals, recommendations
1) The national economy and the real estate market are of high social importance and face a variety of problems and situations that should be resolved and improved. The results of the research show that the following analysed factors are interrelated: cycles, entrepreneurship, macroeconomics, and the overall development of the country.
2) The level of attractiveness of the business environment affects the economic development of the country as well as social and environmental aspects. Human inequality results in income inequality, which is one of the most important factors for the availability of housing for different social groups. Housing affordability is a socio-economic aspect that influences the quality of life of society and is also economically important, so an improvement in the economic situation allows the level of prosperity and the quality of life to increase.
3) Reducing the negative impacts of real estate cycles can be favourable for society in general. Economic fluctuations make the business environment uncertain, so it is necessary to reduce the impact of the negative factors. This aspect, as well as the analysis of business environment development within the real estate market and of the business environment across different regions, are planned future research directions.

Bibliography
1. Barker, K. (2004). Barker Review of Land Use Planning, Interim Report – Analysis. As cited in Jowsey, E. (2011). Real Estate Economics. New York: Palgrave Macmillan. p 498.
2. Begg, D. K.H. (2013). Foundations of Economics. 5th edition. London: McGraw-Hill Education. p.407
3. Burns, A.F., Mitchell, W.C. (1946). Measuring Business Cycles. NBER. ISBN: 0-870-14085-X. p.590. Retrieved: http://www.nber.org/books/burn46-1 Access: 01.11.2015
4. Caune, J., Dzedons, A. (2009). Strategiska vadisana (Strategic Management). Riga: ,,Lidojosa zivs”, 2nd edition, p. 384.
5. Central Statistical Bureau of Latvia. Statistic database. Retrieved: http://www.csb.gov.lv/dati/statistikas-datubazes-28270.html. Access: 01.12.2015
6. Committee on the Global Financial System (2010). Macroprudential Instruments and Frameworks: a Stocktaking of Issues and Experiences. CGFS Papers No 38. Retrieved http://www.bis.org/publ/cgfs38.pdf Access: 25.12.2015
7. Demographia (2015). 11th Annual Demographia International Housing Affordability Survey: 2015. Ratings for Metropolitan Markets. Retrieved: http://www.demographia.com/dhi.pdf Access: 15.11.2015
8. Denisovs, M., Judrupa, I. (2008). Evaluation of Regional Development and Competitiveness. Riga: RTU Publishing House, p. 72.
9. Doing Business (2015a). Economy Rankings. Retrieved: http://www.doingbusiness.org/rankings Access: 20.11.2015
10. Doing Business (2015b). Methodology. Retrieved: http://www.doingbusiness.org/methodology Access: 20.11.2015
11. European Commission (2015). Europe 2020. Retrieved: http://ec.europa.eu/europe2020/index_en.htm Access: 25.11.2015
12. Fedotova, K. (2012). Konkuretspejigas uznemejdarbibas vadisanas nodrosinasana koksnes produktu razosana Latvija (Ensuring Competitive Entrepreneurship Management for Manufacturing Wood Products in Latvia). Riga: RTU Publishing House. p. 46.
13. Fisher, I. (1933). The Debt-Deflation Theory of Great Depressions. Retrieved: https://fraser.stlouisfed.org/docs/meltzer/fisdeb33.pdf Access: 01.11.2015
14. Friedman, M. (1962). Capitalism and Freedom. Retrieved: http://www.pdf-archive.com/2011/12/28/friedman-milton-capitalism-and-freedom/friedman-milton-capitalism-and-freedom.pdf Access: 14.11.2015
15. Galbraith, J.K. (1958) The Affluent Society. as cited in Vilks, A. (2007). Pasaules Sociologi (World Sociologists). Rīga: ,,Drukatava”. p. 156.
16. Giddens, A. (1984). The Constitution of Society. Outline of the Theory of Structuration. Retrieved: http://bookfi.net/book/1041322 Access: 14.11.2015
17. Hajduova, Z., Andrejovsky, P., Beslerova, S. (2014). Development of Quality of Life Economic Indicators with Regard to the Environment. Procedia - Social and Behavioral Sciences. Volume 110. pp. 747-754, doi: 10.1016/j.sbspro.2013.12.919
18. Hajkova, D., Nicoletti, G., Vartia, L. Kwang-Yeol, Y. (2006). Taxation, Business Environment and FDI Location in OECD Countries. Retrieved: http://www.oecd.org/eco/public-finance/37002820.pdf. Access: 09.11.2015
19. Harris, F. McCaffer R., Edum-Fotwe, F. (2013). Modern Construction Management. Seventh edition. UK: Chichester, West Sussex, UK, Wiley-Blackwell. p.377
20. Investment Property Databank (1999). As cited in Evidence of Cycles in European Commercial Real Estate Markets – and Some Hypotheses. Baum, A. Retrieved: http://centaur.reading.ac.uk/27213/1/0500.pdf Access: 12.11.2015
21. Jowsey, E. (2011). Real Estate Economics. New York: Palgrave Macmillan. p 498.
22. Kauskale, L., Geipele, I. (2015). Construction Management - Challenges, Influencing Factors and Importance of Investment Climate. Proceedings of 5th International Conference on Industrial Engineering and Operations Management, United Arab Emirates, Dubai, 3-5 March, 2015. Dubai: IEOM Society, 2015, pp. 522-531. ISBN 978-0-9855497-2-5. ISSN 2169-8767. DOI: 10.1109/IEOM.2015.7093803
23. Kuznets, S. (1930). Secular Movements in Production and Prices; their Nature and their Bearing upon Cyclical Fluctuations. Boston: Houghton Mifflin. Retrieved: http://www.worldcat.org/title/secular-movements-in-production-and-prices-their-nature-and-their-bearing-upon-cyclical-fluctuations/oclc/1134102 Access: 15.11.2015
24. Kuznets, S. (1955). Economic Growth and Income Inequality. The American Economic Review. VOLUME XLV (1). Retrieved: https://www.aeaweb.org/aer/top20/45.1.1-28.pdf Access: 14.11.2015
25. Mises, L. (1949). Human Action: A Treatise on Economics. Retrieved: http://www.econlib.org/library/Mises/HmA/msHmA20.html Access: 10.11.2015
26. OECD (2004). Fostering Entrepreneurship and Firm Creation as a Driver of Growth in a Global Economy. Retrieved: http://www.brokertecnologico.it/attachments/article/42/Fostering_Entrepreneurship.pdf Access: 15.11.2015
27. OECD (2005). Micro-Policies for Growth and Productivity: Final Report. Retrieved: http://www.oecd.org/sti/ind/34941809.pdf Access: 15.11.2015
28. Pareto, V. (1961). "The Circulation of Elites," In Talcott Parsons, Theories of Society; Foundations of Modern Sociological Theory, 2 Vol., The Free Press of Glencoe, Inc., 1961. (English translation) pp. 551–557. As cited in Vilks, A. (2007). Pasaules Sociologi [World Sociologists]. Riga: ,,Drukatava”. p. 156.
29. Prajogo, D.I. (2016). The Strategic Fit between Innovation Strategies and Business Environment in Delivering Business Performance. Innovative Service and Manufacturing Design. Volume 171 (2), pp. 241–249. doi:10.1016/j.ijpe.2015.07.037
30. Quinn, R., E. Beyond Rational Management: Mastering Paradoxes and Competing Demands of high effectiveness. San Francisco: Jossey-Bass; 1988. As cited in Morais, Luis F.; Graça, Luis M. (2013). A glance at the competing values framework of Quinn and the Miles & Snow strategic models: Case studies in health organizations. Retrieved: http://www.elsevier.pt/en/revistas/revista-portuguesa-saude-publica-323/artigo/a-glance-at-the-competing-values-framework-of-90259646. Access: 09.11.2015
31. Romer, C.D. (2008). Business Cycles. The Concise Encyclopedia of Economics. Retrieved: http://econlib.org/library/Enc/BusinessCycles.html Access: 10.11.2015
32. State Regional Development Agency of Latvia (2015). Ekonomikas sfera (Economic Sphere). Retrieved: http://www.vraa.gov.lv/lv/publikacijas/statistika/econom/ Access: 01.12.2015
33. Vide un ilgtspejiga attistiba (Environment and Sustainable Development). Eds. Klavins, M., Zaloksnis, J. Riga: LU Academic Publishing House, 2010. p. 334.
34. Vilks, A. (2007). Pasaules Sociologi (World Sociologists). Riga: ,,Drukatava”. p. 156.
COURSES FOR ERASMUS STUDENTS
ELECTRICAL ENGINEERING
Academic Year 2014/2015
List and description of courses
Winter semester
Course code  Forms-hours/week (L T Lab P S)*  ECTS (breakdown)
1. ELR 021330 Numerical and Optimization Methods (1 0 0 0 0) 2 (2)
2. ELR 021331 Power Quality Assessment (2 0 0 0 0) 2 (2)
3. ELR 022131 Power System Faults E (2 0 0 1 0) 6 (4+2)
4. ELR 022132 Digital Control Systems (2 0 1 0 0) 4 (3+1)
5. ELR 023225 Dynamics and Control of AC and DC Drives E (2 0 0 0 0) 4 (4)
6. ESN 001500 Advanced Technology in Electrical Power Generation (2 1 0 1 0) 5 (3+1+1)
7. ELR 021120 Advanced High Voltage Technology E (2 0 0 0 0) 3 (3)
8. ELR 022135 Artificial Intelligence Techniques (2 0 0 1 0) 3 (2+1)
9. ELR 022233 Power System Automation and Security E (2 0 0 0 1) 4 (3+1)
10. ELR 022532 Electrical Power Systems Management (1 0 0 0 1) 2 (1+1)
11. ELR 023311 Electromagnetic Compatibility (2 0 0 0 1) 3 (2+1)
12. ELR 023312 Advanced Measurement in Electrical Power Engineering (2 0 0 0 0) 2 (2)
13. ELR 023228 Power Electronics (2 0 0 0 0) 3 (3)
14. ELR 021337 Photovoltaic Cells E (2 0 0 0 0) 3 (3)
15. ELR 021338 Industrial Ecology - Selected Issues (1 0 0 0 1) 2 (1+1)
16. ELR 022537 Legal Regulations and Investments in Power Systems with Distributed Energy Sources (2 0 0 0 1) 3 (2+1)
17. ELR 023110 Modeling of Electric Machines (1 0 0 2 0) 3 (1+2)
18. ELR 022334 Energy Storage Systems E (1 0 0 1 0) 3 (2+1)
* L - Lecture, T - Tutorial, Lab - Laboratory, P - Project, S - Seminar
Summer semester
Course code  Forms-hours/week (L T Lab P S)*  ECTS (breakdown)
1. ELR 021332 Selected Problems of Circuit Theory E (2 1 0 0 0) 4 (3+1)
2. ELR 022133 Simulation and Analysis of Power System Transients (1 0 2 0 0) 3 (1+2)
3. ELR 022134 Digital Signal Processing for Protection and Control (2 0 0 2 0) 5 (3+2)
4. ELR 022231 Power System Protection (2 0 0 0 0) 3 (3)
5. ELR 022232 Fiber Optics Communication and Sensors (2 0 0 0 0) 2 (2)
6. ELR 022331 Renewable Energy Sources (2 0 0 0 1) 3 (2+1)
7. ELR 022531 Electric Power System Operation and Control (2 0 0 0 1) 3 (2+1)
8. ELR 022137 Protection and Control of Distributed Energy Sources E (1 0 0 0 1) 3 (2+1)
9. ELR 022332 Water Power Plants (2 0 0 0 1) 3 (2+1)
10. ELR 022333 Renewable Energy Sources E (2 0 0 0 1) 4 (3+1)
11. ELR 022536 Integration of Distributed Resources in Power Systems (2 0 0 0 0) 2 (2)
12. ELR 023313 Analog and Digital Measurement Systems (2 0 0 0 0) 2 (2)
13. ELR 023229 Electromechanical Systems in Renewable Energy Sources (1 0 0 0 1) 2 (1+1)
* L - Lecture, T - Tutorial, Lab - Laboratory, P - Project, S - Seminar
COURSE DESCRIPTIONS
Winter semester
1.
ELR021330 NUMERICAL AND OPTIMIZATION METHODS
Language: English Course: Basic/Advanced
Year (I), semester (1) Level: II Obligatory/Optional
Prerequisites: Mathematics and Matlab course
Teaching: Traditional/Distance learning
Lecturer: Tomasz Sikorski, PhD
Lecture: 15 h/sem, Test, ECTS 2, workload 60 h
Outcome: Ability to implement optimization algorithms for constrained and unconstrained problems.
Content: The course contains theoretical and practical aspects of solving optimization problems. Optimization problem formulation, examples. Mathematical models. Unconstrained and constrained problems. Solution of optimization problems: mathematical preliminaries, numerical methods. Kuhn-Tucker conditions. Lagrangian duality. Selected algorithms for constrained optimization. Linear programming, simplex method. Neural networks and Genetic algorithms for optimization.
Literature:
1. E.K.P. Chong, S.H. Żak: An Introduction to Optimization, 2nd edition, New York, John Wiley, 2001.
2. J.F. Bonnans: Numerical optimization: theoretical and practical aspects, Springer-Verlag, 2003.
3. M. Asghar Bhatti: Practical Optimization Methods, Berlin, Springer-Verlag 2000.
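As a rough illustration of the unconstrained numerical methods listed in the course content (this sketch is not part of the course materials; the quadratic objective and step size are our own choices):

```python
# Minimal gradient-descent sketch for an unconstrained optimization problem.
def grad_descent(grad, x0, lr=0.1, iters=200):
    """Iterate x <- x - lr * grad(x) a fixed number of times."""
    x = list(x0)
    for _ in range(iters):
        g = grad(x)
        x = [xi - lr * gi for xi, gi in zip(x, g)]
    return x

# f(x, y) = (x - 1)^2 + 2*(y + 2)^2 has its minimum at (1, -2).
grad_f = lambda p: [2 * (p[0] - 1), 4 * (p[1] + 2)]
x_min = grad_descent(grad_f, [0.0, 0.0])  # converges toward [1.0, -2.0]
```

Constrained problems of the kind covered by the Kuhn-Tucker and simplex topics need more machinery, but the fixed-point iteration above is the common starting point.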
2.
ELR021331 POWER QUALITY ASSESSMENT
Language: English Course: Basic/Advanced
Year (II), semester (3) Level: II Obligatory/Optional
Prerequisites: Mathematics and Circuit Theory
Teaching: Traditional/Distance learning
Lecturer: Przemyslaw Janik, PhD
Lecture: 30 h/sem, Course work, ECTS 2, workload 60 h
Outcome: Understanding of the basic phenomena and practical engineering aspects of power quality assessment in power systems.
Content: The course contains the basic problems and practical aspects of power quality assessment in power systems. After an introduction and general basis, the following problems are presented:
classes of power quality problems, standards, interruptions, voltage sags, transient overvoltages, harmonics, long duration voltage variations, flicker, power quality measurement, disturbances mitigation methods, chosen algorithms for power quality assessment. A computer-based laboratory supplements the course.
Literature:
1. Arrillaga J. Watson N. R.: Power System Quality Assessment, John Wiley & Sons, New York, 2000
2. Bollen M. H. J.: Understanding Power Quality Problems Voltage Sags and Interruptions, IEEE Press, New York, USA, 2000.
3. Dugan R. C., McGranaghan M. F., Beaty H. W.: Electrical Power Systems Quality, McGraw-Hill, New York, USA, 1986.
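To make the harmonics topic above concrete, a small sketch (our illustration, not course code) computes total harmonic distortion from one cycle of samples with a direct DFT:

```python
import math

def thd(samples, fundamental=1):
    """THD: square root of the summed squared harmonic amplitudes,
    divided by the fundamental amplitude, from a one-cycle DFT."""
    n = len(samples)
    def amp(k):  # amplitude of the k-th harmonic
        re = sum(s * math.cos(2 * math.pi * k * i / n) for i, s in enumerate(samples))
        im = sum(s * math.sin(2 * math.pi * k * i / n) for i, s in enumerate(samples))
        return 2 * math.sqrt(re * re + im * im) / n
    return math.sqrt(sum(amp(k) ** 2 for k in range(2, n // 2))) / amp(fundamental)

# A fundamental plus a 20 % third harmonic gives THD = 0.2 (20 %).
wave = [math.sin(2 * math.pi * i / 64) + 0.2 * math.sin(6 * math.pi * i / 64)
        for i in range(64)]
```

Practical instruments follow standardized grouping and windowing rules rather than this single-cycle form, but the index itself is the one computed here.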
3.
ELR022131 POWER SYSTEM FAULTS
Language: English Course: Basic/Advanced
Year (I), semester (1) Level: II Obligatory/Optional
Prerequisites: Circuit Theory
Teaching: Traditional/Distance learning
Lecturer: Prof. Jan Iżykowski, PhD, DSc
Lecture: 30 h/sem, Exam, ECTS 4, workload 60 h
Project: 15 h/sem, Course work, ECTS 2, workload 30 h
Outcome: Gain basic knowledge regarding power system faults and basic information on the devices such as digital fault recorders and fault locators. Deep familiarization with various problems of power system faults analysis.
Content: The course consists of a lecture and project. The lecture deals with different aspects of power system faults. Fault causes and effects together with classification of faults and analysis of typical fault current wave-shape are delivered in the introduction. Then, the aims of fault calculations and use of per units are specified. The methods used in fault analysis are described. In particular it is focused on the symmetrical component method, for which equivalent diagrams of power system components are described, and then symmetrical and unsymmetrical faults in systems solidly grounded are analysed. Ground faults in networks with: isolated neutral point, neutral point earthed by the compensation reactor and neutral point earthed by the resistor are described. Reference of short-circuit calculations to the present standard is given. Basic characteristic of the devices: digital fault recorder and digital fault locator are delivered. Main issues relevant for transformation of fault currents and voltages by instrument transformers are characterized. During the project students complete individual tasks aimed at deep familiarization with the specific problems of power system faults analysis.
Literature:
1. J. D. Glover, M. Sarma: Power system analysis and design, PWS Publishing Company Boston, second edition, 1994.
2. J. L. Blackburn: Symmetrical components for power systems engineering, Marcel Dekker, New York 1993, Serie: Electrical Engineering and Electronics 85.
3. J-P. Barret, P. Bornard, B. Meyer: Power system simulation: Chapman and Hall, London 1997.
4. P. M. Anderson: Power system protection, IEEE Press, Power Engineering Series, New York 1999.
5. H. Ungrad, W. Winkler, A. Wiszniewski: Protection techniques in electrical energy systems,
Marcel Dekker Inc. New York, Basel, Hong Kong, 1995.
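The symmetrical component method highlighted in the course content can be sketched as follows (a minimal illustration with our own variable names, not course material):

```python
import cmath

A = cmath.exp(2j * cmath.pi / 3)  # the Fortescue "a" operator, 1 at 120 degrees

def symmetrical_components(va, vb, vc):
    """Decompose three phase phasors into zero, positive and
    negative sequence components."""
    v0 = (va + vb + vc) / 3
    v1 = (va + A * vb + A * A * vc) / 3
    v2 = (va + A * A * vb + A * vc) / 3
    return v0, v1, v2

# A balanced set (vb lagging va by 120 degrees) is purely positive-sequence.
v0, v1, v2 = symmetrical_components(1, A * A, A)
```

Unbalanced fault currents decompose into non-zero negative and zero sequence terms, which is what makes the method useful for the fault calculations described above.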
4.
ELR022132 DIGITAL CONTROL SYSTEMS
Language: English Course: Basic/Advanced
Year (II), semester (3) Level: II Obligatory/Optional
Prerequisites: Completed courses: Fundamentals of Control Engineering 1, 2
Teaching: Traditional/Distance learning
Lecturer: Marek Michalik, PhD, Mirosław Łukowicz, PhD
Lecture: 30 h/sem, Course work, ECTS 3, workload 30 h
Laboratory: 15 h/sem, Reports, ECTS 1, workload 90 h
Outcome: Knowledge related to the digital control algorithms design for various types of digital controllers.
Content: Structure of digital control systems, A/C and D/C conversion, conditioning and digital filtering of input signals. Direct Digital Control: PID digital regulators, robust digital regulators, fuzzy control, state variable feedback compensation, digital control with state observers.
Literature:
1. Kuo B.C.: Digital Control Systems. Holt, Rinehart and Winston, Inc., 1981.
2. Santina M.S., Stubberud A.R., Hostetter G.H.: Digital Control Systems. Oxford University Press, 1994.
3. Aufi R.: Digital Control Systems. Prentice Hall. 2004.
4. Isermann R.: Digital Control Systems. Springer-Verlag. 1997.
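A positional-form digital PID regulator of the kind covered under Direct Digital Control might look like the sketch below (illustrative only; the gains, sampling period and first-order plant are our own assumptions, not course values):

```python
def make_pid(kp, ki, kd, dt):
    """Return a step function implementing a positional-form PID law."""
    state = {"integral": 0.0, "prev_err": 0.0}
    def step(setpoint, measurement):
        err = setpoint - measurement
        state["integral"] += err * dt
        deriv = (err - state["prev_err"]) / dt
        state["prev_err"] = err
        return kp * err + ki * state["integral"] + kd * deriv
    return step

# Close the loop around a first-order plant x' = u, sampled at dt = 0.1 s.
pid = make_pid(kp=1.0, ki=0.2, kd=0.0, dt=0.1)
x = 0.0
for _ in range(300):
    x += pid(1.0, x) * 0.1  # Euler step of the plant; x settles near 1.0
```

Industrial implementations add the anti-windup and bumpless-transfer refinements that the lecture's "robust digital regulators" topic addresses.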
5.
ELR023225 DYNAMICS AND CONTROL OF DC AND AC DRIVES
Language: English Course: Basic/Advanced
Year (I), semester (1) Level: II Obligatory/Optional
Prerequisites: Control Theory basics, Electrical Drives and Power Electronics
Teaching: Traditional/Distance learning
Lecturer: Prof. Teresa Orłowska-Kowalska, PhD, DSc
Lecture: 30 h/sem, Exam, ECTS 4, workload 120 h
Outcome: Knowledge of modern control methods for DC and AC motor drives, problems of sensorless drives, and applications of nonlinear controllers in electrical drives.
Content: Basics of control system synthesis problems for electrical drives. Control quality indexes for electrical motors, static and dynamical optimization of electrical drives. Torque control structures; adjustment criteria for linear controllers. Torque and speed control structures of electrical drives; examples of technical realizations in DC and AC drives. Scalar and vector control methods in AC drives with induction and permanent magnet synchronous motors. Field oriented
control and direct torque control of AC motors. State variables estimation for AC motor drives. Electrical drives with microprocessor control. Artificial intelligence methods in electrical drives. In laboratory tasks models and industrial solutions of automated electrical drives are demonstrated and tested.
Literature:
1. Kaźmierkowski M.P., Tunia H., Automatic Control of Converter-fed Drives, Elsevier-PWN, 1994.
2. Orlowska-Kowalska T., Bezczujnikowe układy napędowe z silnikami indukcyjnymi, Oficyna Wydawnicza P.Wr., Wrocław, 2003.
6.
ESN001500 ADVANCED TECHNOLOGY IN ELECTRICAL POWER GENERATION
Language: English Course: Basic/Advanced
Year (I), semester (1) Level: II Obligatory/Optional
Prerequisites: Thermodynamics
Teaching: Traditional/Distance learning
Lecturer: Halina Kruczek, PhD, DSc, Associate Professor
Lecture: 30 h/sem, Exam, ECTS 3, workload 90 h
Tutorials: 15 h/sem, Test, ECTS 1, workload 30 h
Project: 15 h/sem, Course work (Report), ECTS 1, workload 30 h
Outcome: Knowledge, skills and design basics in the field of energy conversion and advanced power production systems, including new zero-emission concepts and nuclear resources.
Content: This course covers fundamentals of thermodynamics, chemistry, flow and transport processes as applied to energy systems. The topics include analysis of energy conversion in thermo-mechanical, thermo-chemical, processes in existing and future power systems, with emphasis on efficiency, environmental impact and performance. Systems utilizing fossil fuels, nuclear resources, over a range of sizes and scales are discussed. Applications include combustion, hybrids, supercritical and combined cycles IGCC. Tutorials and the project supplement the lecture.
Literature:
1. Fundamentals of Heat and Mass Transfer, Frank P. Incropera, David P. DeWitt, John Wiley & Sons, 1996
2. Thermodynamics and heat power, Granet, Irving., Pearson Prentice Hall, cop. 2004.
3. Energy Handbook, Robert Loftness, 1983.
4. Steam: its Generation and Use, The Babcock & Wilcox Company (a McDermott company), ed. by J.B. Kitto and S.C. Stultz, ed. 41, 2005.
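As a back-of-the-envelope companion to the efficiency emphasis above (our arithmetic, not course data; the steam temperatures are illustrative assumptions), the ideal Carnot limit for subcritical versus supercritical conditions:

```python
def carnot_efficiency(t_hot_c, t_cold_c):
    """Ideal (Carnot) efficiency 1 - Tc/Th, with temperatures in Celsius
    converted to kelvin."""
    return 1 - (t_cold_c + 273.15) / (t_hot_c + 273.15)

subcritical = carnot_efficiency(540, 30)    # about 0.63
supercritical = carnot_efficiency(600, 30)  # about 0.65
```

Real cycles achieve far less than these bounds, but the comparison shows why raising steam temperature, as in the supercritical and combined cycles discussed, pays off.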
7.
ELR021120 ADVANCED HIGH VOLTAGE TECHNOLOGY
Language: English Course: Basic/Advanced
Year (II), semester (3) Level: II Obligatory/Optional
Prerequisites: Mathematics, Physics and Electrotechnics Fundamentals
Teaching: Traditional/Distance learning
Lecturer: Prof. Bolesław Mazurek, PhD, DSc
Lecture: 30 h/sem, Exam, ECTS 3, workload 90 h
Outcome: Acquaintance with the modern methods of generation and measurement of high voltage. Knowledge about the application of high electric fields in industry technological processes, in agriculture, medicine and science.
Content: The course discusses the newest technology issues and knowledge necessary for electrical engineers. Generation of high voltage and high voltage measurement techniques will be discussed. Electrical field distribution and electrical field control methods will also be presented. A significant part of the lecture is the presentation of electrical discharges in gases, fluids, vacuum and solid dielectrics. The transmission DC lines and high electrical field application for technology processes will be shown as examples of practical high voltage engineering. A few laboratory trainings supplement the course, giving the possibility to measure voltages up to a few hundred kV.
Literature:
1. Haddad A., Warne D., Advances in High Voltage Engineering. Institution of Electrical Engineers, 2004.
2. Kuffel E., Zaengl W.S., Kuffel J., High Voltage Fundamentals. Newnes 2003.
3. Beyer M., Boeck W., Moeller K., Zaengl W., High voltage engineering. Springer 1986.
8.
ELR022135 ARTIFICIAL INTELLIGENCE TECHNIQUES
Language: English Course: Basic/Advanced
Year (II), semester (3) Level: II Obligatory/Optional
Prerequisites: Completed courses: Mathematics, Circuit Theory, Fundamentals of Control Engineering
Teaching: Traditional/Distance learning
Lecturer: Waldemar Rebizant, PhD, DSc, Associate Professor
Lecture: 30 h/sem, Course work, ECTS 2, workload 90 h
Project: 15 h/sem, Project, ECTS 1, workload 30 h
Outcome: The students are expected to present knowledge of the theory of artificial intelligence techniques, with special attention to their application in power system protection and control problems.
Content: The course covers the following items: introduction to artificial intelligence techniques in power system control; Expert Systems – main features, structure, inference methods, strategies for conflict resolving, application fields; systems based on Fuzzy Logic – fuzzy signals, membership functions, fuzzy settings, fuzzification and defuzzification methods, multicriterial algorithms; Artificial Neural Networks – main features, neurone types, activation functions, neural network architectures, learning methods, application fields; Genetic Algorithms – evolutionary strategies, genetic modifications, application examples; Hybrid Intelligent Schemes; application examples of intelligent techniques described for power system protection and control purposes.
Literature:
1. Pao Y.A.: “Adaptive Pattern Recognition and Neural Networks”, Addison-Wesley, Reading, MA, 1989.
2. Yager R.R. and Filev D.P.: ”Essentials of Fuzzy Modelling and Control”, J. Wiley & Sons, Inc., New York, USA, 1994.
3. Ringland G.A. and Duce D.A. (ed. By): “Approaches to Knowledge Representation: An Introduction”, Research Studies Press Ltd., Wiley & Sons, Chichester, England, 1988.
4. Dillon T.S. and Niebur D. (edited by): “Neural Network Applications in Power Systems”, CRL Publishing Ltd., London, 1996.
5. Cichocki A., Unbehauen R.: “Neural Networks for Optimization and Signal Processing”, John Wiley & Sons, 1993.
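The fuzzification step from the Fuzzy Logic part of the outline can be sketched with a triangular membership function (the numeric ranges and linguistic terms below are our own illustrative choices):

```python
def triangular(a, b, c):
    """Triangular membership function: rises from a to a peak at b,
    then falls back to zero at c."""
    def mu(x):
        if x <= a or x >= c:
            return 0.0
        if x <= b:
            return (x - a) / (b - a)
        return (c - x) / (c - b)
    return mu

# Fuzzify a measured current of 55 A against two linguistic terms.
normal = triangular(0, 40, 80)   # membership of 55 A is 0.625
high = triangular(60, 100, 140)  # membership of 55 A is 0.0
```

A fuzzy protection rule base then combines such memberships through the inference and defuzzification methods the course lists.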
9.
ELR022233 POWER SYSTEM AUTOMATION AND SECURITY
Language: English Course: Basic/Advanced
Year (II), semester (3) Level: II Obligatory/Optional
Prerequisites: Completed courses: Power System Protection
Teaching: Traditional/Distance learning
Lecturer: Prof. Bogdan Miedziński, PhD, DSc
Lecture: 30 h/sem, Exam, ECTS 3, workload 120 h
Seminar: 15 h/sem, Course work (Report), ECTS 1, workload 30 h
Outcome: The students are expected to present the knowledge of the transient phenomena encountered in power systems and related protection and control concepts, as well as to know what methods/systems should be applied to assure the safe operation of power systems.
Content: The course is intended to acquaint students with modern concepts in sensing and contact units components, convertors for digital protections, security problems, trends in substation automation as well as preventive and adaptive protection systems related to power system automation applications. The course describes chosen protection engineering problems of special interest to the student and provides students with a background for further study in science and applications.
Literature:
1. KTV Grattan, Sensors-technology, systems and Applications, A.Hilger IOP Publishing Ltd, 1991.
2. Power System Protection, Volume 4: Digital Protection and Signalling, Short Run Press Ltd, Exeter, 1997.
3. H.Ungrad, W.Winkler, A.Wiszniewski: Protection techniques in electrical energy systems, Marcel Dekker Inc. New York, Basel, Hong Kong, 1995.
4. Selected papers published in renowned international journals.
10.
ELR022532 ELECTRICAL POWER SYSTEMS MANAGEMENT
Language: English Course: Basic/Advanced
Year (II), semester (3) Level: II Obligatory/Optional
Prerequisites: Power Systems
Teaching: Traditional/Distance learning
Lecturer: Prof. Artur Wilczyński, PhD, DSc
Lecture: 15 h/sem, Test, ECTS 1, workload 30 h
Seminar: 15 h/sem, Course work, ECTS 1, workload 30 h
Outcome: Knowledge about management problems at energy companies after power system restructuring.
Content: Organization of power sector. Introduction to the deregulation and restructuring of power sector. Development of electricity market. Examples of electricity markets. Tasks of transmission and distribution system operators. Regulation of the electricity industry. Organization of access to the system. Electricity price mechanism, transmission pricing principles. System planning under competition.
Literature:
1. Malko J., Wilczyński A., Rynki energii – działania marketingowe. Oficyna Wydawnicza Politechniki Wrocławskiej, Wrocław 2006.
2. S. Hunt, G. Shuttleworth: Competition and Choice in Electricity, John Wiley & Sons, Chichester - New York - Weinheim - Brisbane - Singapore - Toronto, 1997.
3. M. Ilic, F. Galiana, L. Fink: Power systems restructuring, engineering and economics, KLUWER Academic Publishers, Boston – Dordrecht – London, 1998.
4. Directive 2003/54/EC of the European Parliament and of the Council, of 26 June 2003, concerning common rules for the internal market in electricity and repealing Directive 96/92/EC.
5. Philipson L., Willis H. L.: Understanding Electric Utilities and De-Regulation. Marcel Dekker, Inc., New York 1999.
11.
ELR023311 ELECTROMAGNETIC COMPATIBILITY
Language: English Course: Basic/Advanced
Year (II), semester (3) Level: II Obligatory/Optional
Prerequisites: Completed courses: Mathematics, Circuit Theory and High Voltage Engineering
Teaching: Traditional/Distance learning
Lecturer: Grzegorz Kosobudzki, PhD
Lecture: 30 h/sem, Course work, ECTS 2, workload 60 h
Seminar: 15 h/sem, Course work, ECTS 1, workload 30 h
Outcome: Acquaintance with practical aspects of EMC and power quality in power delivery systems.
Content: The course covers the basic problems and practical aspects of electromagnetic compatibility (EMC). The following problems are presented: electromagnetic disturbances caused by lightning strikes and electrostatic discharges; EMC phenomena generated by converter-fed drives; methods of protecting electrical and electronic equipment from overvoltages and overcurrents; aspects of electromagnetic shielding; power quality parameters, requirements and standards; the influence of power quality phenomena on equipment; the influence of non-linear devices on power quality; disturbance mitigation techniques; harmonics reduction; measurements.
Literature:
1. Hasse P.: Overvoltage protection of low voltage systems, TJ International, Padstown, 2000.
2. Prasad Kodali V.: Engineering Electromagnetic Compatibility: Principles, Measurements and Technology, IEEE Press, New York, 1996.
3. Dugan R. C., McGranaghan M. F., Beaty H. W.: Electrical Power Systems Quality, McGraw-Hill, New York, USA, 1986.
12.
ELR023312 ADVANCED MEASUREMENTS IN ELECTRIC POWER ENGINEERING
Language: English Course: Basic/Advanced
Year (II), semester (3) Level: II Obligatory/Optional
Prerequisites: Mathematics and Circuit Theory
Teaching: Traditional/Distance learning
Lecturer: Prof. Nawrocki Zdzisław, PhD, DSc, Prof. Fleszyński Janusz, PhD, DSc
Lecture: 30 h/sem, Course work, ECTS 2, workload 90 h
Outcome: Knowledge of the basic problems and practical aspects of analogue and digital measurement.
Content: The course deals with the basic problems and practical aspects of analogue and digital measurement. After an introduction and a general theoretical part, the following practical problems are presented: measurement of voltage and current (DC and AC) and power metering; high voltage measurements; diagnostic tests of high voltage equipment insulation. The course familiarizes students with methods of high voltage measurement and generation and with partial discharge investigation. Special emphasis is put on the physical and metrological fundamentals of different kinds of diagnostic tests (electric, acoustic, optoelectronic, physico-chemical), the detection of various defects in equipment insulation, and problems of modern diagnostics.
Literature:
1. J. McGhee, I.A. Henderson, M.J. Korczyński, W. Kulesza: Scientific metrology, Technical University of Lodz, Lodz, 1998.
2. J. McGhee, I.A. Henderson, M.J. Korczyński, W. Kulesza: Measurement data handling, vol. 1 and vol. 2, Technical University of Lodz, Lodz, 2001.
3. N. Kularatna: Digital and analogue instrumentation, IEE, London, 2003.
4. D. Kind: An introduction to high voltage experimental technique, Vieweg, 1980.
5. E. Kuffel, W.S. Zaengl, J. Kuffel: High Voltage Engineering Fundamentals, Elsevier, 2000.
13.
ELR023228 POWER ELECTRONICS
Language: English Course: Basic/Advanced
Year (I), semester (1) Level: II Obligatory/Optional
Prerequisites: Electronics
Teaching: Traditional supporting e-learning/Distance L.
Lecturer: Zbigniew Załoga, PhD
Lecture Tutorials Laboratory Project Seminar
Hours / sem. (h) 30
Exam / Course work/T: Course work
ECTS 3
Workload (h) 60
Outcome: To intensify theoretical and practical knowledge of power electronics systems.
Content: Semiconductor power switches: SCR, TRIAC, BJT, MOSFET, IGBT. Complementary components and systems. Converters: rectifiers, AC controllers, choppers, inverters. Common applications of converters, including renewable energy source systems.
Literature:
1. N. Mohan, T. M.Undeland, W.P. Robbins, Power Electronics. Converters, Applications, Design, John Wiley & Sons, Inc. 1995
2. A.M. Trzynadlowski, Introduction to Modern Power Electronics, John Wiley & Sons, Inc. 1998
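The chopper (DC-DC converter) topic listed in the course content can be sketched with the textbook relation for an ideal step-down (buck) chopper in continuous conduction, Vout = D·Vin; the 48 V input is an illustrative value:

```python
def buck_vout(vin, duty):
    """Average output voltage of an ideal step-down chopper: Vout = D * Vin."""
    if not 0.0 <= duty <= 1.0:
        raise ValueError("duty cycle must be in [0, 1]")
    return duty * vin

for d in (0.25, 0.5, 0.75):
    print(f"D = {d:.2f} -> Vout = {buck_vout(48.0, d):.1f} V")
```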
14.
ELR021337 PHOTOVOLTAIC CELLS (E)
Language: English Course: Basic/Advanced
Year (II), semester (3) Level: II Obligatory/Optional
Prerequisites: Power Systems
Teaching: Traditional/Distance L.
Lecturer: Przemysław Janik, PhD
Lecture Tutorials Laboratory Project Seminar
Hours / sem. (h) 30
Exam / Course work/T: Exam
ECTS 3
Workload (h) 60
Outcome: Introduce photovoltaic effect and the operation principles of photovoltaic cells; Describe the fabrication technology of photovoltaic cells and photovoltaic batteries, their characteristics and parameters; Identify the effect of various factors on the conversion efficiency of photovoltaic devices; Explain construction and production steps of photovoltaic modules; Describe transformation and storage of electrical energy from photovoltaic modules; Discuss construction of concentrating solar power systems.
Content: Introduction of basic concepts and energy units; Identification of energy sources, analysis of energy resources and their influence on the environment; Characterization of the solar radiation and the properties of the earth’s atmosphere; Description of the photovoltaic effect and the basic physical models of solar cells. Review of technologies, the parameters and characteristics of the photovoltaic cells; Description of factors affecting efficiency of photovoltaic energy conversion; Description of construction and production steps for photovoltaic modules, methods of energy storage and conversion.
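The "basic physical models of solar cells" mentioned above usually start from the single-diode equation I = Iph − I0·(exp(V/(n·Vt)) − 1). The sketch below evaluates it and coarsely scans for the maximum power point; all cell parameters are illustrative, not taken from a datasheet:

```python
import math

def diode_cell_current(v, i_ph=5.0, i_0=1e-9, n=1.3, t=298.15):
    """Ideal single-diode model (no series/shunt resistance):
    I = I_ph - I_0 * (exp(V / (n * Vt)) - 1).  Parameter values are
    illustrative."""
    k, q = 1.380649e-23, 1.602176634e-19
    vt = k * t / q                        # thermal voltage, ~25.7 mV at 25 C
    return i_ph - i_0 * (math.exp(v / (n * vt)) - 1.0)

# Coarse 1 mV scan for the maximum power point of this illustrative cell
v_mpp, p_mpp = max(((v / 1000.0, v / 1000.0 * diode_cell_current(v / 1000.0))
                    for v in range(0, 700)), key=lambda vp: vp[1])
print(f"P_mpp ~ {p_mpp:.2f} W at V ~ {v_mpp:.3f} V")
```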
Literature:
1. S.R. Wenham, M.A. Green, M.E. Watt, R. Corkish: Applied Photovoltaics, Earthscan, London, 2009.
2. J.D. Myers, Solar Applications In Industry and Commerce, Prentice-Hall, New Jersey 1984
3. V.D. Hunt: Handbook of Conservation and Solar Energy, Van Nostrand Reinhold, New York, 1982.
15.
ELR021338 INDUSTRIAL ECOLOGY – SELECTED ISSUES (E)
Language: English Course: Basic/Advanced
Year (II), semester (3) Level: II Obligatory/Optional
Prerequisites: none
Teaching: Traditional/Distance L.
Lecturer: Zbigniew Leonowicz, PhD
Lecture Tutorials Laboratory Project Seminar
Hours / sem. (h) 15 15
Exam / Course work/T: Exam Course work (Presentation)
ECTS 1 1
Workload (h) 60 30
Outcome: Knowledge of various aspects of industrial ecology. Capability of analysis and recognition of problems related to waste reduction and modeling of industrial processes in accordance with principles of laws of nature.
Content: Fundamentals of industrial ecology, the science of sustainability in industrial and engineering problems. The aims of industrial ecology are: minimizing energy and materials usage, ensuring an acceptable quality of life, minimizing the ecological impact of human activity, and maintaining the economic viability of systems.
Literature:
1. Allenby B.R., Richards D.J.: The Greening of Industrial Ecosystems, National Academy Press, Washington, 1994.
2. IEEE White Paper on Sustainable Development and Industrial Ecology, IEEE 1995
3. Frosch R.A., “Industrial Ecology: A Philosophical Introduction,” Proceedings of the National Academy of Sciences, USA 89 (February 1992): 800–803
16.
ELR022537 LEGAL REGULATIONS AND INVESTMENTS IN POWER SYSTEMS WITH DISTRIBUTED ENERGY SOURCES
Language: English Course: Basic/Advanced
Year (II), semester (3) Level: II Obligatory/Optional
Prerequisites: Power Systems
Teaching: Traditional/Distance L.
Lecturer: Prof. Artur Wilczynski, PhD, DSc
Lecture Tutorials Laboratory Project Seminar
Hours / sem. (h) 15 15
Exam / Course work/T: Course work Course work (Presentation)
ECTS 2 1
Workload (h) 30 30
Outcome: Obtaining knowledge of national and EU legal, technical and economic conditions for the construction of technological systems supplied from renewable energy sources, as well as principles of investment projects for distributed and dispersed generation.
Content: The fundamentals of legal regulations in the field of usage of renewable energy sources in power systems, European Union and national legal documents in the field of renewable energy sources, principles of well-balanced expansion. Renewable energy sources on electricity and heat markets as well as investment process in distributed and dispersed power systems with application of renewable energy sources (conception, formal and legal requirements, financing, realization) are also discussed.
Literature:
1. G. Boyle: Renewable Energy – Power for a Sustainable Future, Second Edition, Oxford University Press Inc., New York, 2004.
2. T. Burton, D. Sharpe, N. Jenkins, E. Bossanyi: Wind Energy Handbook, John Wiley and Sons Ltd. Chichester, England, 2001.
3. A. Luque, S. Hegedus: Handbook of photovoltaic science and engineering, John Wiley and Sons Ltd. Chichester, England, 2003.
4. T. Markvart: Solar electricity, Second Edition, UNESCO, John Wiley and Sons Ltd. New York, 2000.
17.
ELR022110 MODELLING OF ELECTRICAL MACHINES
Language: English Course: Basic/Advanced
Year (II), semester (3) Level: II Obligatory/Optional
Prerequisites: Completed basic courses of Engineering Graphics, Informatics and Electrical Engineering
Teaching: Traditional with use of multimedia/Distance L.
Lecturer: Krzysztof Makowski, PhD, DSc, Associate Professor
Lecture Tutorials Laboratory Project Seminar
Hours / sem. (h) 15 30
Exam / Course work/T: Course work Course work
ECTS 1 2
Workload (h) 30 60
Outcome: To learn principles of present-day methods of electromagnetic modeling of electrical machines.
Content: Mathematical grounds of electromagnetic field theory, electromagnetic quantities and Maxwell's equations. Outline of the finite element method (FEM) and its application to magnetostatic and magnetodynamic linear and non-linear problems. Field-circuit equations of electromechanical converters, taking into account the movement of the moving parts of the converter. Methods for calculating electromagnetic torque and power losses. General rules for formulating field models of the converters and presentation of some examples of modelling electromechanical converters by FEM.
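As a toy analogue of the FEM formulations described above, the sketch below solves a 1D magnetostatic problem, −d²A/dx² = μ0·J with A = 0 at both ends, using linear elements and a tridiagonal (Thomas) solver; the geometry and current density are illustrative, and for this constant-source problem the nodal values are exact:

```python
import math

# 1D magnetostatic FEM sketch with linear elements.
mu0, J, L, n = 4e-7 * math.pi, 1e6, 0.1, 50   # n elements, illustrative data
h = L / n
# Assemble the tridiagonal stiffness matrix and consistent load (interior nodes)
a = [-1.0 / h] * (n - 2)                       # sub/super diagonal (symmetric)
b = [2.0 / h] * (n - 1)                        # main diagonal
f = [mu0 * J * h] * (n - 1)                    # load vector
# Thomas algorithm: forward elimination ...
for i in range(1, n - 1):
    m = a[i - 1] / b[i - 1]
    b[i] -= m * a[i - 1]
    f[i] -= m * f[i - 1]
# ... and back substitution
A = [0.0] * (n - 1)
A[-1] = f[-1] / b[-1]
for i in range(n - 3, -1, -1):
    A[i] = (f[i] - a[i] * A[i + 1]) / b[i]
a_max = max(A)
print(f"max A = {a_max:.3e} Wb/m (analytic mu0*J*L^2/8 = {mu0*J*L*L/8:.3e})")
```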
Literature:
1. Di Barba P., Savini A., Wiak S.: Field models in electricity and magnetism, Springer, 2008.
2. Bolkowski S. i inni : Komputerowe metody analizy pola elektromagnetycznego, WNT, Warszawa, 1993
3. Hameyer K., Belmans R.: Numerical modelling and design of electrical machines and devices, WIT Press, Southampton, 1999.
4. Lowther D.A., Silvester P.P.: Computer aided design in magnetics, Springer-Verlag, Berlin Heidelberg New York Tokyo, 1986.
5. Silvester P.P., Ferrari R.L.: Finite elements for electrical engineers, Cambridge University Press, Cambridge, 1983.
18.
ELR022334 ENERGY STORAGE SYSTEMS
Language: English Course: Basic/Advanced
Year (I), semester (2) Level: II Obligatory/Optional
Prerequisites: Electrical devices
Teaching: Traditional/Distance L.
Lecturer: Kazimierz Herlender, PhD
Lecture Tutorials Laboratory Project Seminar
Hours / sem. (h) 15 30
Exam / Course work/T: Test Course work (Project documentation)
ECTS 2 1
Workload (h) 60 30
Outcome: The main aims of the course are to learn about different kinds of energy storage systems and the basics of battery energy storage design.
Content: Classification and main characteristics of different kinds of electrical energy storage in power systems, such as: pumped hydro energy storage, flywheel systems, compressed air energy storage (CAES), fuel cells, Superconducting Magnetic Energy Storage (SMES), ultracapacitors and Battery Energy Storage (BES). The main parameters of these energy storage technologies are compared and possible areas of their application are given.
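For a rough comparison of two of the listed technologies, stored energy follows E = ½Jω² for a flywheel and E = ½CV² for an ultracapacitor; all device parameters below are illustrative:

```python
import math

def flywheel_energy_kwh(j_kgm2, rpm):
    """Kinetic energy E = 0.5 * J * w^2, converted from J to kWh."""
    w = rpm * 2 * math.pi / 60          # angular speed in rad/s
    return 0.5 * j_kgm2 * w * w / 3.6e6

def cap_energy_kwh(c_farad, v):
    """Capacitor energy E = 0.5 * C * V^2, converted from J to kWh."""
    return 0.5 * c_farad * v * v / 3.6e6

print(f"flywheel:      {flywheel_energy_kwh(500.0, 10000):.2f} kWh")
print(f"ultracap bank: {cap_energy_kwh(63.0, 125.0):.3f} kWh")
```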
Literature:
1. Batterie-Energiespeicher in der Elektrizitätsversorgung – Kompendium, H.-J. Haubrich [Hrsg.], Verlag Mainz, Aachen, 1996.
2. Proceedings of EU-Project ICOP-DISS-2140-96, Distributed Energy Storage for Power Systems, ed. Feser K., Styczyński Z.A., Verlag Mainz, Aachen, 1998.
3. Markiewicz H. Urządzenia elektroenergetyczne. WNT, Warszawa 2001.
Summer semester
1.
ELR1311 SELECTED PROBLEMS OF CIRCUIT THEORY
Language: English Course: Basic/Advanced
Year (I), semester (2) Level: II Obligatory/Optional
Prerequisites: Mathematics, Differential Equations, Linear Algebra and Basic Circuit Theory
Teaching: Traditional/Distance L.
Lecturer: Tomasz Sikorski, PhD
Lecture Tutorials Laboratory Project Seminar
Hours / sem. (h) 30 15
Exam / Course work: Exam Test
ECTS 3 1
Workload (h) 90 60
Outcome: Ability of carrying out synthesis of electrical circuits with the optimization approach, knowledge about phenomena in nonlinear circuits, selected methods of analysis.
Content: The course deals with selected problems of Synthesis of Linear Circuits & Systems, as well as Analysis of Nonlinear Electrical Circuits - theoretical and practical aspects of linear circuits design based on different methods and requirements. Furthermore, the course discusses the aspects of nonlinear circuits’ analysis and structures, with practical examples and exercises.
Literature:
1. L.O. Chua, C.A. Desoer, E.S. Kuh: Linear and Nonlinear Circuits, New York: McGraw-Hill Book Co., 1987.
2. H. Baher: Synthesis of Electrical Networks, New York: J. Wiley, 1984.
3. F. Kouril, K. Vrba.: Non-Linear And Parametric Circuits: Principles, Theory And Applications, Chichester: Ellis Horwood, 1988.
2.
ELR2110 SIMULATION AND ANALYSIS OF POWER SYSTEM TRANSIENTS
Language: English Course: Basic/Advanced
Year (I), semester (2) Level: II Obligatory/Optional
Prerequisites: Linear Algebra, Differential Equations, Numerical Methods for Linear and Nonlinear Equations
Teaching: Traditional/Distance L.
Lecturer: Prof. Eugeniusz Rosołowski, PhD, DSc
Lecture Tutorials Laboratory Project Seminar
Hours / sem. (h) 15 15
Exam / Course work: Course work Course work
ECTS 1 1
Workload (h) 30 30
Outcome: Understanding of digital models for the simulation of electromagnetic transients in complex three-phase electric networks, ability of applying the models for practical problems in power systems.
Content: The course consists of a lecture and a project. Both of these forms deal with the following problems: modelling of physical systems - basic principles; numerical oscillation and accuracy of discrete models; digital models of basic electric elements with lumped and distributed parameters; models of selected three-phase system elements: lines, transformers, generators; models of non-linear electric elements: diodes, thyristors, varistors and non-linear inductance; numerical methods used in the EMTP program for solving linear and non-linear network equations; EMTP application to the simulation of selected problems using basic network elements: transmission line, transformer, generator, instrument transformers; using the ATPDraw program for the preparation of simulation cases; using the MODELS module for the simulation of auxiliary procedures: measurement, control and protection; analysis of simulation results: the PLOTXY program; the EMTP-MATLAB interface. During the laboratory, students complete individual tasks aimed at deep familiarization with specific problems of electromagnetic transients analysis in power systems.
Literature:
1. N. Watson, J. Arrillaga: Power systems electromagnetic transients simulation. The Institution of Electrical Engineers, London 2003.
2. H.W. Dommel: Electromagnetic Transients Program. Reference Manual. BPA, Portland, 1986.
3. J. D. Glover, M. Sarma: Power system analysis and design, PWS Publishing Company Boston, second edition, 2002.
4. W. D. Stevenson: Elements of Power System Analysis (4th Ed.). McGrawHill, New York, 1982.
5. J-P. Barret, P. Bornard, B. Meyer: Power system simulation: Chapman and Hall, London 1997.
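The discrete element models taught in this course are, in EMTP-type programs, trapezoidal-rule companion models (Dommel's method): an inductor becomes a resistance 2L/Δt with a history current source. A minimal sketch for a series RL circuit with a DC source, using illustrative values:

```python
# Trapezoidal (Dommel) companion model of an inductor in a series RL
# circuit driven by a DC source -- the discretisation scheme used by
# EMTP-type programs.  Circuit values are illustrative.
V, R, L, dt = 100.0, 10.0, 0.1, 1e-4
RL = 2.0 * L / dt                     # companion resistance of the inductor
i, v = 0.0, 0.0                       # inductor current and voltage
for _ in range(20000):                # simulate 2 s (time constant L/R = 10 ms)
    i_hist = i + v / RL               # history current source
    v = (V / R - i_hist) / (1.0 / R + 1.0 / RL)   # nodal solution at the node
    i = i_hist + v / RL               # inductor branch current
print(f"i after 2 s ~ {i:.3f} A (analytic steady state {V/R:.1f} A)")
```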
3.
ELR2111 DIGITAL SIGNAL PROCESSING FOR PROTECTION AND CONTROL
Language: English Course: Basic/Advanced
Year (I), semester (2) Level: II Obligatory/Optional
Prerequisites: Completed courses: Mathematics, Circuit Theory and Informatics
Teaching: Traditional/Distance L.
Lecturer: Prof. Andrzej Wiszniewski, PhD, DSc; Waldemar Rebizant, PhD, DSc, Associate Professor
Lecture Tutorials Laboratory Project Seminar
Hours / sem. (h) 30 15
Exam / Course work/T: Exam Course work (Report)
ECTS 4 1
Workload (h) 120 30
Outcome: As an effect of the course completion, the students are expected to possess the knowledge on the theory of digital signal processing as applied to power system control and protection systems. The students should show the ability of choosing proper algorithms of signal processing for given practical problems encountered in power system protection and control.
Content: The course deals with basic problems and practical aspects of digital signal processing for power system protection and control. After an introduction and general theoretical and numerical basis, the following practical problems are presented: analog filtration, A/D conversion, digital filtration (FIR & IIR filters design and parameters), estimation of signal parameters (criterion values), decision making methods and algorithms, chosen algorithms of power system control, integrated measurement and control systems. A computer-based laboratory supplements the course.
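One standard instance of the "estimation of signal parameters (criterion values)" step is full-cycle DFT phasor estimation of the fundamental component; the signal parameters and sampling rate below are illustrative:

```python
import math

# Full-cycle DFT phasor estimation of a 50 Hz signal sampled at
# N = 20 samples per cycle (i.e. an illustrative 1 kHz rate).
f0, N = 50.0, 20
amp, phase = 100.0, math.radians(30)           # true signal parameters
samples = [amp * math.cos(2 * math.pi * k / N + phase) for k in range(N)]

re = (2.0 / N) * sum(s * math.cos(2 * math.pi * k / N)
                     for k, s in enumerate(samples))
im = -(2.0 / N) * sum(s * math.sin(2 * math.pi * k / N)
                      for k, s in enumerate(samples))
mag = math.hypot(re, im)                       # recovered amplitude
ang = math.degrees(math.atan2(im, re))         # recovered phase angle
print(f"estimated magnitude {mag:.2f}, angle {ang:.1f} deg")
```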
Literature:
1. H. Ungrad, W. Winkler, A. Wiszniewski: “Protection techniques in electrical energy systems”, Marcel Dekker Inc. New York, Basel, Hong Kong, 1995.
2. T. Krauss, L. Shure, J. Little: Signal Processing Toolbox for use with Matlab, User's Guide.
3. L.B. Jackson: Digital filters and signal processing, Kluwer Academic Publishers, Boston, 1986.
4.
ELR2210 POWER SYSTEM PROTECTION
Language: English Course: Basic/Advanced
Year (I), semester (2) Level: II Obligatory/Optional
Prerequisites: Grounded knowledge of electricity and passing grades in: Electrical Measurement, Electrical Machines, Electrical Devices and Electrical Power Systems
Lecturer: Prof. Bogdan Miedziński PhD, DSc
Lecture Tutorials Laboratory Project Seminar
Hours / sem. (h) 30 15
Exam / Course work/T: Exam Course work
ECTS 4 2
Workload (h) 120 60
Outcome: Understanding principles of protective relaying in power system, ability of designing and setting of protection schemes for various power system elements.
Content: Objects and tasks of power system protection. Main requirements regarding power system relaying. Converters of measurement quantities for protection needs. Main criteria for detecting faults. Relaying principles of main power system elements, i.e.: generators, transformers, high voltage motors, distribution and transmission lines. Power system protection in preventive and restoration mode – objectives and general application principles.
Literature:
1. Ungrad H., Winkler W., Wiszniewski A., Protection techniques in electrical energy systems, Marcel Dekker Inc., New York 1995.
2. Horowitz S. H., Phadke A.G., Power system relaying, RSP England 1992.
3. Praca zbiorowa pod red. B. Synala, Automatyka elektroenergetyczna, ćwiczenia laboratoryjne, część I: Przetworniki sygnałów pomiarowych i przekaźniki automatyki zabezpieczeniowej, część II: Układy automatyki zabezpieczeniowej i regulacyjnej skrypt Politechniki Wrocł., Wrocław 1991.
5.
ELR2211 FIBER OPTICS COMMUNICATIONS AND SENSORS
Language: English Course: Basic/Advanced
Year (I), semester (2) Level: II Obligatory/Optional
Prerequisites: Applied Physics, Electronics and Electromagnetic Theory
Teaching: Traditional/Distance L.
Lecturer: Prof. Bogdan Miedziński, PhD, DSc
Lecture Tutorials Laboratory Project Seminar
Hours / sem. (h) 30 15
Exam / Course work: Course work Course work
ECTS 2 1
Workload (h) 60 30
Outcome: Acquaintance with the problems of processing and transmission of information by means of the fibre optic technique and applicability of fibre sensors in practice.
Content: Wave theory of light propagation. Signal transmission and processing. Problems of generation and detection. Communication systems; expanding system capacity by multiplexing. Optical phenomena used in fibre sensors; applicability of direct and remote sensors in practice.
Literature:
1. Chai Yeh: Handbook of Fiber Optics: Theory and Applications, Academic Press Inc., London, 1990.
2. J.L. Horner: Optical Signal Processing, Academic Press Inc., London, 1987.
3. CIGRE Working Group 35.04: Optical Fibre Cable Selection for Electricity Utilities, February 2001.
6.
ELR2312 RENEWABLE ENERGY SOURCES
Language: English Course: Basic/Advanced
Year (I), semester (2) Level: II Obligatory/Optional
Prerequisites: Applied Statistics
Teaching: Traditional/Distance L.
Lecturer: Prof. Zbigniew Styczyński, PhD, DSc
Lecture Tutorials Laboratory Project Seminar
Hours / sem. (h) 30 15
Exam / Course work/T: Course work Course work
ECTS 2 1
Workload (h) 60 30
Outcome: Understanding of problems concerned with renewable energy sources.
Content: The course deals with the basic problems and practical aspects of renewable energy sources. After an introduction and general theoretical basis, the following problems are presented: wind energy, solar energy, biomass energy, geothermal energy and wave energy. Presentations contain: introduction, scientific principles of work, energy conversion, advantages and disadvantages, technology, applications, examples of energy projects, economics, environmental impacts and benefits. The seminar supplements the course.
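The wind energy conversion step mentioned above reduces, at its simplest, to P = ½ρACpv³, with Cp bounded by the Betz limit 16/27 ≈ 0.593; the rotor diameter and power coefficient below are illustrative:

```python
import math

# Wind turbine power: P = 0.5 * rho * A * Cp * v^3 (illustrative turbine).
rho, diameter, cp = 1.225, 80.0, 0.45     # air density, rotor diameter, Cp
area = math.pi * (diameter / 2) ** 2      # swept rotor area

def wind_power_mw(v):
    """Mechanical power captured at wind speed v (m/s), in MW."""
    return 0.5 * rho * area * cp * v ** 3 / 1e6

for v in (5, 10, 12):
    print(f"{v:>2} m/s -> {wind_power_mw(v):.2f} MW")
```

Note the cubic dependence: doubling the wind speed gives eight times the power.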
Literature:
1. J. Twidell, T. Weir: Renewable Energy Resources, Seventh Edition, Spon Press, London, 2005.
2. T. Burton, D. Sharpe, N. Jenkins, E. Bossanyi: Wind Energy Handbook, John Wiley and Sons Ltd. Chichester, England, 2001.
3. A. Luque, S. Hegedus: Handbook of photovoltaic science and engineering, John Wiley and Sons Ltd. Chichester, England, 2003.
7.
ELR2518 ELECTRIC POWER SYSTEM OPERATION AND CONTROL
Language: English Course: Basic/Advanced
Year (I), semester (2) Level: II Obligatory/Optional
Prerequisites: Programming in Matlab and Electric Power Systems
Teaching: Traditional/Distance L.
Lecturer: Prof. Marian Sobierajski, PhD, DSc
Lecture Tutorials Laboratory Project Seminar
Hours / sem. (h) 30 15
Exam / Course work: Test Test
ECTS 4
Workload (h) 120
Outcome: Knowledge of control and regulation of voltage and frequency in transient states.
Content: Steady-state and short-circuit analysis. Voltage regulation and voltage stability. Exciters and voltage regulators. Speed regulators. Dynamic and transient stability.
Literature:
1. Machowski J., Bialek J. W., Bumby J. R., Power System Dynamics and Stability. John Wiley and Sons 1997.
2. Sobierajski M., Łabuzek M., Lis R., Electric Power System Analysis in Matlab, Wroclaw University of Technology, 2007.
8.
ELR2141 PROTECTION AND CONTROL OF DISTRIBUTED ENERGY SOURCES
Language: English Course: Basic/Advanced
Year (I), semester (2) Level: II Obligatory/Optional
Prerequisites: Power system faults
Teaching: Traditional/Distance L.
Lecturer: Prof. Eugeniusz Rosołowski, PhD, DSc
Lecture Tutorials Laboratory Project Seminar
Hours / sem. (h) 15 15 15
Exam / Course work/T: Exam Course work Course work
ECTS 3 1 1
Workload (h) 90 30 30
Outcome: The course provides descriptions of protection relaying techniques applied in distributed generation networks.
Content: The course consists of the lecture, lab and seminar. All these forms deal with the following problems: The purpose of power system protection. Basic protection criteria and main characteristics. Distributed generation: overview of applied energy sources. Line, transformer and generator protection. Interconnection systems: solutions and requirements. Loss of mains protection: applied criteria for islanding detection. Protection of photovoltaic sources. In this course the focus is on the issues relating to the power system protection including both the network protection and the protection of distributed generation.
Literature:
1. Jenkins N., Allan R., Crossley P., Kirschen D., Strbac G.: Embedded generation, The Institution of Electrical Engineers, London, 2000.
2. Anderson P.M., Power System Protection, McGraw-Hill, IEEE Press, 1999
3. Bergen A.R., Vittal V., Power systems analysis. Prentice Hall, Upper Saddle River, N.J., 2000.
4. Patel M.R., Wind and Solar Power Systems. CRC Press, Boca Raton 1999.
9.
ELR2343 WATER POWER PLANTS
Language: English Course: Basic/Advanced
Year (I), semester (2) Level: II Obligatory/Optional
Prerequisites: Electrical devices
Teaching: Traditional/Distance L.
Lecturer: Kazimierz Herlender, PhD
Lecture Tutorials Laboratory Project Seminar
Hours / sem. (h) 30 15
Exam / Course work/T: Course work Course work
ECTS 2 1
Workload (h) 60 30
Outcome: To get basic knowledge of the design, building and operation of hydro power stations.
Content: The course covers problems of the design, building and operation of hydro power stations, including estimation of the hydrological potential of water, construction of basic hydrotechnical and electrical equipment, classification of hydro plants and types of turbines, basic problems of hydro plant automation and control (including control of turbines and generators), and problems of planning small hydro plants – law, procedures, feasibility studies.
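Estimating the hydrological potential of a site typically starts from P = ρgQHη; the flow, head and efficiency below are illustrative small-hydro values:

```python
def hydro_power_kw(q_m3s, head_m, eta=0.85, rho=1000.0, g=9.81):
    """Hydro plant output P = rho * g * Q * H * eta, in kW.
    Default efficiency is an illustrative overall value."""
    return rho * g * q_m3s * head_m * eta / 1000.0

# Illustrative run-of-river site: 2 m^3/s flow over a 15 m head
print(f"{hydro_power_kw(2.0, 15.0):.1f} kW")
```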
Literature:
1. Bobrowicz Władysław, Small Hydro Power – Investor Guide Leonardo Energy, Utilisation Guide Section 8 – Distributed Generation, Autumn 2006
2. Harvey A., Micro-hydro power, 2004,
3. Allan. Undershot Water Wheel. 2008
4. Shannon, R. Water Wheel Engineering. 1997
5. Pacey, A. Technology in World Civilization: A Thousand-year History, 1997
10.
ELR2345 RENEWABLE ENERGY SOURCES
Language: English Course: Basic/Advanced
Year (I), semester (2) Level: II Obligatory/Optional
Prerequisites: Applied Statistics
Teaching: Traditional/Distance L.
Lecturer: Prof. Zbigniew Styczyński, PhD, DSc
Lecture Tutorials Laboratory Project Seminar
Hours / sem. (h) 30 15
Exam / Course work/T: Exam Course work
ECTS 3 1
Workload (h) 90 30
Outcome: Understanding of problems concerned with renewable energy sources.
Content: The course deals with the basic problems and practical aspects of renewable energy sources. After an introduction and general theoretical basis, the following problems are presented: wind energy, solar energy, biomass energy, geothermal energy and wave energy. Presentations contain: introduction, scientific principles of work, energy conversion, advantages and disadvantages, technology, applications, examples of energy projects, economics, environmental impacts and benefits. The seminar supplements the course.
Literature:
1. J. Twidell, T. Weir: Renewable Energy Resources, Seventh Edition, Spon Press, London, 2005.
2. T. Burton, D. Sharpe, N. Jenkins, E. Bossanyi: Wind Energy Handbook, John Wiley and Sons Ltd. Chichester, England, 2001.
3. A. Luque, S. Hegedus: Handbook of photovoltaic science and engineering, John Wiley and Sons Ltd. Chichester, England, 2003.
11.
ELR2541 INTEGRATION OF DISTRIBUTED RESOURCES IN POWER SYSTEMS
Language: English Course: Basic/Advanced
Year (I), semester (2) Level: II Obligatory/Optional
Prerequisites: Electrical Power Systems, Power System Faults, Power System Protection, Energy Production
Teaching: Traditional/Distance L.
Lecturer: Prof. Marian Sobierajski, PhD, DSc
Lecture Tutorials Laboratory Project Seminar
Hours / sem. (h) 30 15
Exam / Course work/T: Test Course work
ECTS 3 1
Workload (h) 90 30
Outcome: Familiarize students with problems and technical aspects for integrating of distributed energy resources in power systems.
Content: Classification of distributed energy resources (DER). Aimed level penetration of DER in electric power system. Wind generation. Modeling of DER. Schemes and points of connection of distributed generation to a distribution system. Load flow and short circuit simulation in electric power network with dispersed generation. Analysis of impact of distributed generators on power load flow, short-circuit currents, voltage changes, power quality and protection of distribution network. Technical requisites for producer connection to the public electric power grids. Influence of DER on frequency regulation in electric power system. Autonomous operation of distributed generators. Microgrids.
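The short-circuit simulation topic above can be sketched with the IEC 60909 expression for the initial symmetrical short-circuit current, Ik'' = c·Un/(√3·Zk), where c is the voltage factor; the network impedance used here is an illustrative value:

```python
import math

def ik_ka(un_kv, zk_ohm, c=1.1):
    """Initial symmetrical short-circuit current Ik'' = c*Un/(sqrt(3)*Zk),
    in kA.  c = 1.1 is a typical maximum voltage factor."""
    return c * un_kv / (math.sqrt(3) * zk_ohm)

# Illustrative 20 kV node with 1.5 ohm equivalent short-circuit impedance
print(f"Ik'' = {ik_ka(20.0, 1.5):.2f} kA")
```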
Literature:
1. Jenkins N., Allan R., Crossley P., Kirschen D., Strbac G.: Embedded Generation. Power & Energy, 2000.
2. Loi Lei Lai, Tze Fun Chan: Distributed Generation. 2007 John Wiley & Sons, Ltd.
12.
ELR3352 ANALOGUE AND DIGITAL MEASUREMENT SYSTEMS
Language: English Course: Basic/Advanced
Year (I), semester (2) Level: II Obligatory/Optional
Prerequisites: Basis of electrical engineering, basis of electronics, electrical measurements
Teaching: Traditional/Distance L.
Lecturer: Daniel Dusza, PhD
Lecture Tutorials Laboratory Project Seminar
Hours / sem. (h) 30 15
Exam / Course work/T: Test Course work
ECTS 2 1
Workload (h) 60 30
Outcome: Providing knowledge and skills concerning the design of analog and digital measuring systems.
Content: Functional diagrams of measuring systems used with renewable energy sources; signal conversion – sensor types, analog and digital blocks of transducers used in renewable energy measurements; principles of constructing measuring systems to measure: wind speed, wave energy, passive solar building energy, temperatures, noise, flows, vibrations; control and processing equipment, programmable instruments; evolution of renewable energy measuring systems.
Literature:
1. Clayton G., Winder S.: Operational amplifiers, Newnes, Oxford, 2003.
2. Horowitz P., Hill W., The art of electronics, Cambridge University Press, New York, 2007.
3. Jung W., IC Op Amp cookbook, Prentice-Hall PTR, 1999.
4. Jung W., Op Amp applications, Handbook, Elsevier/Newnes, Oxford 2006.
5. Lyons R.G., Understanding digital signal processing, Addison Wesley Longman, 1997.
13.
ELR023229 ELECTROMECHANICAL SYSTEMS IN RENEWABLE ENERGY SOURCES
Language: English Course: Basic/Advanced
Year (I), semester (2) Level: II Obligatory/Optional
Prerequisites: Basis of electrical engineering, basis of electronics, electrical measurements
Teaching: Traditional/Distance L.
Lecturer: Prof. Krzysztof Pienkowski, PhD, DSc
Lecture Tutorials Laboratory Project Seminar
Hours / sem. (h) 15 15
Exam / Course work/T: Test Presentation
ECTS 1 1
Workload (h) 30 30
Outcome: Providing knowledge on the electromechanical aspects of frequency converter application in renewable energy systems.
Content: The course focuses on the electromechanical aspects of frequency converter structures applied for the conversion of primary energy, and on the electromechanical processes occurring in renewable energy systems operating autonomously as well as integrated with power systems. The course is supplemented by a seminar in which students prepare presentations on subjects related to the assessment of particular energy conversion networks applied in renewable energy systems.
Literature:
[1] Anaya-Lara O., Jenkins N., Ekanayake J., Cartwright P., Hughes M.: Wind Energy Generation. Modelling and Control. John Wiley & Sons, 2009.
[2] Burton T., Sharpe D., Jenkins N., Bossanyi E.: WIND ENERGY HANDBOOK. John Wiley & Sons, 2001.
[3] Johnson G. L.: WIND ENERGY SYSTEMS. Manhattan, KS. Electronic Edition, 2001.
[4] Vas P.: Electrical Machines and Drives: A Space-Vector Theory Approach. Oxford University Press, 1992.
Supplementary reading:
[1] White D.C., Woodson H.M.: Electromechanical Energy Conversion. New York, John Wiley & Sons, 1959.
[2] Seely S.: Electromechanical Energy Conversion. New York, McGraw Hill, 1962.
[3] Krause P.C.: Analysis of electric machinery. McGraw Hill, 1986
CALIBRATION OF PROCESS ALGEBRA MODELS OF DISCRETELY OBSERVED STOCHASTIC BIOCHEMICAL SYSTEMS
Paola Lecca
The Microsoft Research – University of Trento
Centre for Computational and Systems Biology
Piazza Manci 17, 38123 Povo (Trento), Italy
lecca@cosbi.eu
ABSTRACT
We present a maximum likelihood method for inferring the kinetics of stochastic systems of chemical reactions, given discrete time-course observations of the abundance of either some or all of the molecular species and a BlenX model of the system. BlenX is a process calculus providing a tool and algebraic laws for a high-level description of interactions, communications, and synchronizations between processes representing the biomolecules. BlenX offers an efficient alternative to differential equations, but it poses different challenges to model calibration. The main difficulty is the sampling of the reaction pathways between two observed states. We define a maximum likelihood function in terms of reaction propensities and we estimate it by sampling the intermediate pathways from the transition system of a BlenX model. The method of sampling the transition system is inspired by elementary mode analysis. Our method is illustrated with the example of a BlenX model of chaperone-assisted protein folding.
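As a toy illustration of propensity-based maximum likelihood (not the BlenX pathway-sampling method itself), the sketch below simulates a single first-order reaction A → B with Gillespie's stochastic simulation algorithm and recovers the rate constant; for a fully observed path the MLE is k_hat = (number of events) / integral of A(t) dt:

```python
import random

# Gillespie SSA for A -> B with rate k_true, followed by the MLE of k
# from the fully observed path.  Initial population is illustrative.
random.seed(1)
k_true, A = 0.5, 200
events, integral = 0, 0.0
while A > 0:
    a = k_true * A                  # reaction propensity
    tau = random.expovariate(a)     # waiting time to the next reaction
    integral += A * tau             # accumulate the integral of A(t) dt
    A -= 1
    events += 1
k_hat = events / integral           # propensity-based MLE
print(f"k_true = {k_true}, k_hat = {k_hat:.3f}")
```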
Keywords: BlenX, parameter estimation, maximum likelihood estimation.
AUTHORS BIOGRAPHY
Dr. Paola Lecca received a Master Degree in Theoretical Physics from the University of Trento (Italy) and a PhD in Computer Science from the International Doctorate School in Information and Communication Technologies at the University of Trento (Italy). She is currently the Principal Investigator of the Inference and Data Manipulation research group at The Microsoft Research – University of Trento Centre for Computational and Systems Biology (Trento, Italy). Her research interests include stochastic biochemical kinetics, biological network inference, optimal experimental design in biochemistry, and computational cell biology. She has designed prototypes for biological model calibration and for the simulation of diffusion pathways in cells and tissues. She has published articles in leading medical, biological, and bioinformatics journals and conferences (http://www.cosbi.eu/).
Journal of Applied Science and Agriculture, 9(11) Special 2014, Pages: 63-68
AENSI Journals
Journal of Applied Science and Agriculture
ISSN 1816-9112
Journal home page: www.aensiweb.com/JASA
21st Century Core Soft Skills Research Focus for Integrated Online Project Based Collaborative Learning Model
1Sharifah Nadiyah Razali, 2Hanipah Hussin and 1Faaizah Shahbodin
1Faculty of Information and Communication Technology, Malaysia Technical University Malacca, Hang Tuah Jaya, 76100, Durian Tunggal, Melaka, Malaysia
2Centre of Language and Islamic, Malaysia Technical University Malacca, Hang Tuah Jaya, 76100, Durian Tunggal, Melaka, Malaysia
ARTICLE INFO
Article history:
Received 25 June 2014
Received in revised form 8 July 2014
Accepted 10 August 2014
Available online 30 August 2014

Keywords:
Soft skills, 21st century learning, 21st century skills, Online collaborative learning

ABSTRACT
Background: Unemployed graduates have become a cause of anxiety in Malaysia. Even though these graduates have excellent academic skills, this does not guarantee them a job, owing to the fierce competition in the current career market. Academic achievement is no longer the primary criterion for getting a job, because most employers look for good soft skills when selecting new employees. Objective: This study aims to determine the core soft skills related to 21st century learning skills, which will be the focus of an Integrated Online Project Based Collaborative Learning model. Several previous study reports, conference proceedings, and journals were reviewed as a literature review and analysed, with the data collected using a matrix table. Results: The results show that there are four core domains for 21st century learning, based on the ISTE (2007), EnGauge (2003), Seven Cs (2006), and P21 (2006) frameworks, namely the collaboration, communication, problem solving, and critical thinking soft skills. Conclusion: This study will continue to focus on these four core skills for the Integrated Online Project Based Collaborative Learning model to be used in the next research level.
© 2014 AENSI Publisher. All rights reserved.
To Cite This Article: Sharifah Nadiyah Razali, Hanipah Hussin and Faaizah Shahbodin, 21st Century Core Soft Skills Research Focus for Integrated Online Project Based Collaborative Learning Model. J. Appl. Sci. & Agric., 9(11): 63-68, 2014.
INTRODUCTION
Soft skills are particular abilities that can improve employment performance and career prospects. Moss and Tilly (2001) defined soft skills as 'skills, abilities and traits associated with personality, attitude and behaviour, as distinct from formal or technical knowledge'. Meanwhile, Hurrell (2009) defined soft skills as 'involving interpersonal and intrapersonal abilities to facilitate the performance of control in certain contexts'. Harvey, Locke and Morey (2004) and Ahmad, Ali and Hamzah (2011) proposed that employability assets consist of knowledge, skills, and attitudes. Most employers now look primarily for good soft skills, rather than academic achievement, when selecting new employees. Research by Juen, Pang and Vitales (2010), which collected feedback from industries, shows that Malaysian polytechnic students did not meet the levels of competency and working attitude that industries expected. Interview sessions with programme heads from Politeknik Ibrahim Sultan, Politeknik Merlimau, Politeknik Tuanku Syed Sirajuddin, Politeknik Kota Kinabalu, and Politeknik Sultan Idris Shah found that a lack of soft skills was the main reason graduates were unemployed (Razali et al., 2014). Public opinion often attributes graduates' failure to find employment to their not having the soft skills required by employers. Soft skills are deemed highly attractive in industry. Therefore, the role of Higher Educational Institutions (HEIs) is to train students in soft skills that accord with job demands.
In 21st Century Learning, students use educational technologies to apply knowledge to new situations, to analyse information, to collaborate, to solve problems, and to make decisions (Razali et al., 2013). Utilising emerging technologies to provide expanded learning opportunities is critical to the success of future generations. It can broaden students' options and choices and help to improve student completion and achievement. Greenhill (2009) listed the advantages of 21st Century Learning Environments, such as:
• Provide infrastructure, human resources, and learning materials that support the 21st century learning environment in order to produce the 21st century skills needed.
Corresponding Author: Sharifah Nadiyah Razali, Faculty of Information and Communication Technology, Malaysia Technical University Malacca, Hang Tuah Jaya, 76100, Durian Tunggal, Melaka, Malaysia E-mail: shnadiyah@yahoo.com
• Provide professional learning communities that enable educators to collaborate and share best practices in integrating 21st century skills into classroom practice.
• Enable students to learn in the real context of the 21st century through project-based learning or other applied work.
• Allow equal access to quality learning tools, technologies, and resources.
• Provide 21st century architectural and interior designs for group, team, and individual learning.
• Support local and international communities' involvement in the 21st century learning environment.
A 21st century learning environment differs from previous learning environments. Table 1 shows the differences between past learning environments and a 21st century learning environment.
Table 1: Differences between past learning environments and a 21st century learning environment.
Previous learning environments → A 21st century learning environment:
• Teacher-centred classes → Learner-centred classes, with teachers as facilitators/collaborators
• Focused on listening, speaking, reading, and writing skills → Focused on interpersonal, intrapersonal, and presentational skills
• Emphasised the educator as presenter → Emphasised the learner as a doer
• Used technology as a supplemental tool → Integrated technology into instruction to enhance learning
• Provided the same learning environment to all learners → Provided learning environments based on individual needs
• Traditional learning environments built around textbooks → Personal learning environments that address real-world tasks
• Testing to find out what students don't know → Assessing to find out what students can do, based on rubrics
• Learning for school → Learning for life
Much research has been done by educators and researchers to help practitioners integrate 21st century skills into the learning environment. The following frameworks were selected for this study:
• Assessment and Teaching of 21st Century Skills (ATCS), developed by the University of Melbourne and sponsored by Cisco, Intel, and Microsoft in 2009. This project aims to provide definitions of 21st century skills and to design innovative assessment tasks that can be used in the classroom.
• The International Society for Technology in Education (ISTE) framework, which revised 21st century skills based on student standards and technology in the 2007 curriculum.
• The Partnership for 21st Century Skills (P21) framework, developed in the United States for K-12 education purposes in 2006.
• The Seven Cs framework, proposed by Bernie Trilling in 2006. This framework builds on and applies the basic 3Rs (reading, 'riting, and 'rithmetic).
• The EnGauge framework, introduced by the Metiri Group and NCREL in 2003. This framework places more emphasis on new contextual skills and knowledge.
Of the above frameworks, only the P21 and EnGauge frameworks focus on the skills needed to improve the quality of teaching and learning (Voogt & Roblin, 2010).
MATERIALS AND METHODS
The aim of this study is to determine the core soft skills related to 21st century learning skills, which will be the focus of an Integrated Online Project Based Collaborative Learning model. To achieve this aim, the study was conducted qualitatively in the form of a document review. Several previous study reports, conference proceedings, and journals were consulted as a literature review and analysed, with the data collected using a matrix table (Strauss and Corbin, 1990). According to Sallabas (2013) and Best and Kahn (1998), the document review method is the most appropriate tool for collecting information in a qualitative study. Moreover, Onwuegbuzie, Leech, and Collins (2012) believe that the variables relevant to a topic can be identified by conducting a quality review of the literature. According to Stewart (2009), the materials and resources that can be used to carry out the analysis and interpretation include (i) journals and books, (ii) research literature, and (iii) research papers and scholarly material reports.
Results:
Current conceptual frameworks for "21st Century Skills" include ATCS (2009) by the University of Melbourne, the ISTE framework (2007), the Partnership for 21st Century Skills (P21, 2006), the Seven Cs framework (2006), and the EnGauge framework from the Metiri Group and NCREL (2003). The elements of 21st century skills, which are defined based on these frameworks, are summarised in Table 2.
Table 2: Current conceptual frameworks for "21st Century Skills".
No  Framework        Soft skills elements
1   ATCS (2009)      i. Creativity and Innovation; ii. Critical Thinking, Problem Solving, and Decision Making; iii. Leadership; iv. Communication and Collaboration
2   ISTE (2007)      i. Creativity and Innovation; ii. Critical Thinking, Problem Solving, and Decision Making; iii. Communication and Collaboration
3   P21 (2006)       i. Critical Thinking and Problem Solving; ii. Creativity and Innovation; iii. Communication and Collaboration
4   Seven Cs (2006)  i. Critical Thinking and Doing; ii. Creativity; iii. Collaboration; iv. Cross-cultural Understanding; v. Communication; vi. Computing; vii. Career and Self-reliance
5   EnGauge (2003)   i. Managing Complexity, Creativity, and Higher Order Thinking; ii. Collaboration, Social, and Communication
Based on these frameworks, a matrix table was drawn up to analyse the 21st century core skills. The results are illustrated in Table 3.
Table 3: 21st century skills matrix table. Skills (Critical Thinking, Problem Solving, Communication, Collaboration, Creativity, Decision Making, Innovation, Leadership, and Social) mapped against the ATCS (2009), ISTE (2007), P21 (2006), Seven Cs (2006), and EnGauge (2003) frameworks.
From the table, five skills emerge, namely collaboration, communication, problem solving, critical thinking, and creative thinking. However, according to Greenhill (2009), for students the creative thinking skill comes after the collaboration, communication, problem solving, and critical thinking skills. Therefore, this study focuses only on these four core skills (collaboration, communication, problem solving, and critical thinking).
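The matrix-table tally that underlies Table 3 can be sketched in a few lines of Python. The skill sets below are a normalisation of the Table 2 entries made for this sketch (compound elements such as "Communication and Collaboration" are split into individual skills), so the exact counts are illustrative rather than the study's original coding:

```python
from collections import Counter

# Skill elements per framework, normalised from Table 2.
# Splitting compound entries into individual skills is an
# assumption made for this illustration.
frameworks = {
    "ATCS (2009)": {"creativity", "innovation", "critical thinking",
                    "problem solving", "decision making", "leadership",
                    "communication", "collaboration"},
    "ISTE (2007)": {"creativity", "innovation", "critical thinking",
                    "problem solving", "decision making",
                    "communication", "collaboration"},
    "P21 (2006)": {"critical thinking", "problem solving", "creativity",
                   "innovation", "communication", "collaboration"},
    "Seven Cs (2006)": {"critical thinking", "creativity", "collaboration",
                        "cross-cultural understanding", "communication",
                        "computing", "career and self-reliance"},
    "EnGauge (2003)": {"managing complexity", "creativity",
                       "higher order thinking", "collaboration",
                       "social", "communication"},
}

# Count how many frameworks mention each skill (the matrix-table tally).
coverage = Counter(skill for skills in frameworks.values()
                   for skill in skills)

for skill, n in coverage.most_common():
    print(f"{skill}: {n}/5 frameworks")
```

Under this normalisation, collaboration and communication are covered by all five frameworks, with critical thinking and problem solving close behind, which is consistent with the four core skills the study settles on.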
Discussion:
The four core domains for 21st century learning are based on the ATCS (2009), ISTE (2007), P21 (2006), Seven Cs (2006), and EnGauge (2003) frameworks, coupled with the collaboration, communication, problem solving, and critical thinking soft skills. The skills identified align with surveys of what employers seek in college graduates by NACE (2012) and Casner-Lotto and Barrington (2006). These surveys also identify collaboration, communication, problem solving, and critical thinking as the most wanted skills for college graduates. Based on the P21 skills framework, all four core skills are summarised in Table 4 below.
Conclusion:
The skills that employers demand are changing. Graduates now need to acquire soft skills to enhance their prospects of good employment. Research has shown that graduates from Malaysian polytechnics do not meet the level of competency and working attitude expected by industries. HEIs need to produce graduates with the skills required by employers; therefore, the development of a soft skills study plan is needed. The four core domains of 21st century learning are based on the ATCS (2009), ISTE (2007), P21 (2006), Seven Cs (2006), and EnGauge (2003) frameworks, coupled with the collaboration, communication, problem solving, and critical thinking soft skills. Research by Sancho et al. (2011) shows that collaborative learning promotes the development of soft skills.
Table 4: Summary of the P21 skills framework (Source: Greenhill, 2009).

Critical Thinking
Reason effectively
• Use various types of reasoning (i.e., inductive, deductive, etc.) as appropriate to the situation
Use systems thinking
• Analyse how parts of a whole interact with each other to produce overall outcomes in complex systems

Problem Solving
Make judgments and decisions
• Effectively analyse and evaluate evidence, arguments, claims, and beliefs
• Analyse and evaluate major alternative points of view
• Synthesize and make connections between information and arguments
• Interpret information and draw conclusions based on the best analysis
• Reflect critically on learning experiences and processes
Solve problems
• Solve different kinds of non-familiar problems in both conventional and innovative ways
• Identify and ask significant questions that clarify various points of view and lead to better solutions

Communication
Communicate clearly
• Articulate thoughts and ideas effectively using oral, written, and nonverbal communication skills in a variety of forms and contexts
• Listen effectively to decipher meaning, including knowledge, values, attitudes and intentions
• Use communication for a range of purposes (e.g., to inform, instruct, motivate and persuade)
• Utilize multiple media and technologies, and know how to judge their effectiveness a priori as well as assess their impact
• Communicate effectively in diverse environments (including multi-lingual)

Collaboration
Collaborate with others
• Demonstrate the ability to work effectively and respectfully with diverse teams
• Exercise flexibility and willingness to be helpful in making necessary compromises to accomplish a common goal
• Assume shared responsibility for collaborative work, and value the individual contributions made by each team member
Collaborative learning, a learning approach rooted in the theory of constructivism (Vygotsky, 1978), has been used as a learning strategy worldwide for many years (Ashton-Hay, 2006). According to Johnson and Johnson (1989), learning tends to be most effective when students work collaboratively, expressing their thoughts, discussing and challenging ideas with others, and working together towards a group solution to a given problem. Research has shown that undergraduates improve their academic performance by interacting with their peers (Chen, 2011). Even though the benefits of collaborative learning are widely acknowledged, as previously discussed, graduates still lack the soft skills currently demanded by employers.
Fig. 1: 21st Century Core Soft Skills Research Focus.
Filigree (2012) claims that there are five maturity stages in collaboration. To develop effective online collaborative learning, both collaborative learning and an advanced instructional model must be fully supported by technology, people, and processes. Technology is seen as an important enabler for improving student learning outcomes; however, to get the greatest value from technology, best practices are required. Therefore, collaborative learning and advanced instructional models, fully supported by technology, people, and processes, are proposed to develop the 21st century skills of collaboration, communication, critical thinking, and problem solving. This study focuses only on these four core skills (i.e., collaboration, communication, critical thinking, and problem solving) for an Integrated Online Project Based Collaborative Learning model.
ACKNOWLEDGEMENT
The author wishes to express her gratitude to her supervisors, Associate Professor Dr. Hanipah binti Hussin, Assoc. Prof. Dr. Faaizah Shahbodin and Dr. Norasiken binti Bakar, who were abundantly helpful and offered invaluable assistance, support, and guidance. The author would also like to thank the management, lecturers, and students of the various polytechnics for their involvement, cooperation, and support in this study. Last but not least, the author would like to express infinite love to her beloved husband, family, and colleagues for their support and encouragement. This research was done by a PhD candidate from UTeM.
REFERENCES
Ahmad, S., N. Ali, M.F. Hamzah, 2011. Kebolehpasaran Graduan UKM: Satu Kajian Perbandingan Antara Graduan Disiplin Sains dengan Bukan Sains. Jurnal Personalia Pelajar, 14: 81-90.
Ashton-Hay, S., 2006. Constructivism and powerful learning environments: create your own! In 9th International English Language Teaching Convention.
Best, J.W., J.V. Kahn, 1998. Research in Education. United States of America: A Viacom Company.
Casner-Lotto, J., L. Barrington, 2006. Are they really ready to work?: Employers’ perspectives on the basic knowledge and applied skills of new entrants to the 21st century U.S. workforce. United States: Conference Board: Partnership for 21st Century Skills: Corporate Voices for Working Families: Society for Human Resource Management.
Chen, Y., 2011. Learning styles and adopting Facebook technology. In Technology Management in the Energy Smart World (PICMET) (pp: 1-9).
Filigree, C., 2012. Instructional Technology and Collaborative Learning Best Practices: Global Report and Recommendations.
Greenhill, V., 2009. 21 st Century Learning Environments.
Harvey, L., W. Locke, A. Morey, 2004. Enhancing employability, recognising diversity. Retrieved from http://heer.qaa.ac.uk/SearchForSummaries/Summaries/Pages/GLM24.aspx
Hurrell, S.A., 2009. Soft skills deficits in Scotland: their patterns, determinants and employer responses.
Johnson, D.W., R.T. Johnson, 1989. Cooperation and competition: Theory and research. Edina, MN: Interaction Book Company.
Juen, J.W.Y., V. Pang, J.W. Vitales, 2010. OBE Curriculum Implementation Process in Politeknik Kota Kinabalu: A Possible Evaluation Framework. In Prosiding Seminar Transformasi Pendidikan Teknikal (pp: 172-181).
Moss, P., C. Tilly, 2001. Stories Employers Tell: Race, Skills and Hiring in America. New York: Russell Sage Foundation.
NACE, 2012. Job Outlook: The Candidate Skills/Qualities Employers Want. Retrieved from http://www.naceweb.org/surveys/job-outlook.aspx.
Onwuegbuzie, A., N. Leech, K. Collins, 2012. Qualitative Analysis Techniques for the Review of the Literature. Qualitative Report, 17: 1-28. Retrieved from http://files.eric.ed.gov/fulltext/EJ981457.pdf
Razali, S.N., F. Shahbodin, N. Bakar, H. Hussin, M.H. Ahmad, 2014. Perceptions towards the Usage of Collaborative Learning in Teaching and Learning Processes at. In International Conference on Advances in Computing, Communication and Information Technology.
Razali, S., F. Shahbodin, N. Bakar, H. Hussin, M.H. Ahmad, N. Sulaiman, 2013. Incorporating Learning Management System with Social Network Sites to Support Online Collaborative Learning: Preliminary Analysis. Advances in Visual ..., 549-557. Retrieved from http://link.springer.com/chapter/10.1007/978-3-319-02958-0_50.
Sallabas, M.E., 2013. Analysis of narrative texts in secondary school textbooks in terms of values education. Educational Research and Reviews, 8(8): 361-366. doi:10.5897/ERR12.190.
Sancho, P., J. Torrente, E.J. Marchiori, B. Fernández-Manjón, 2011. Enhancing moodle to support problem based learning. The Nucleo experience. In IEEE Global Engineering Education Conference (EDUCON) (pp: 1177-1182).
Stewart, A.M., 2009. Research Guide for Students and Teachers.
Strauss, A., J. Corbin, 1990. Basics of qualitative research: Grounded theory procedures and techniques. Newbury Park, CA: Sage.
Voogt, J., N.P. Roblin, 2010. 21st Century Skills - Discussion Paper.
Vygotsky, L.S., 1978. Mind in Society. Cambridge (Massachusetts): Harvard University Press.