Identify Cancer Using Personalized Machine Learning Product
issue 1

Prof. Roshan. R Kolte, Miss. Pooja S. Korde, Miss. Payal B. Bhure, Miss. Minal D. Wath, Miss. Nilima N. Bobade, Miss. Sapna S. Patel
Department of Information Technology, KDKCE, RTMNU, Maharashtra Nagpur-440009.


Diagnosing cancer at an early stage is of utmost importance if the high mortality rate is to be reduced. Global cancer screening programmes aim to use positron emission tomography (PET) and computed tomography (CT) examinations among the older, at-risk age groups to improve the early detection rate. Despite the use of invasive techniques, symptoms rarely appear until the disease is advanced, making it difficult for radiologists to identify lesions. Unfortunately, most cancer patients present at advanced stages, resulting in a dismal five-year survival rate of 17.8%, and only 4% for distant tumours. Genuine and accurate information is the basis of disease control initiatives. More than 85% of the disease is linked to tobacco consumption. In addition, genetic factors, exposure to environmental pollutants and second-hand smoking accelerate the disease. Treatments including chemotherapy, radiotherapy, surgery and epidermal receptive drugs improve the survival rate and quality of life. Tumor detection and removal is one medical subject that still remains challenging in the field of biomedicine. Early imaging techniques such as pneumoencephalography and cerebral angiography had the disadvantage of being invasive, and hence CT and MRI imaging techniques help surgeons by providing a better view. In this paper, tumor image processing includes three stages, namely pre-processing, segmentation and morphological operations. After the source image is acquired, it is pre-processed by converting the original image to grayscale; a high-pass filter for noise removal and a median filter for quality enhancement are then applied, followed by an enhancement stage that yields a histogram-equalized image. Finally, segmentation is performed by means of the watershed algorithm.
The proposed methodology helps to generate reports automatically in a short span of time, and the improvement has made it possible to extract many finer parameters of the tumor. The technique is aimed at diagnosis at early and critical stages using intelligent computational techniques, with various distortions removed by segmentation techniques and algorithms, which is the root concept of image processing. CT images obtained from cancer institutes are analysed using MATLAB.


Cancer can be described as uncontrolled cell growth with the capability to spread throughout the body. Our body contains red blood cells (RBCs). Their main purpose is to supply oxygen (O2) to all parts of the body through the bloodstream, and they are what makes blood appear red [1]. In the brain and lungs, tissue receives oxygen (O2) only because of RBCs. The cytoplasm of erythrocytes has a high concentration of haemoglobin. The cell membrane consists of proteins and lipids, which support physiological cell function. Mature RBCs lack a nucleus and most organelles, which leaves room for haemoglobin. Around 20 lakh (2 million) new RBCs are produced per second [2]. The cells are produced in the bone marrow and circulate through the body for about 4 months, to and fro in arteries and veins. Each circulation takes about 20 seconds. About 75% of the cells in the human body, and the major fraction of the blood, are red blood cells [3, 4]. The young Dutch biologist Jan Swammerdam described them with a microscope in 1658.

A brain tumor is an abnormal mass of tissue in which cells grow and multiply uncontrollably [7], seemingly unchecked by the mechanisms that control normal cells. The symptoms of a brain tumor depend on the tumor size, type and location. Some common symptoms are headaches, nausea and vomiting [2], changes in speech, vision or hearing, problems in walking, seizures or convulsions, changes in mood, personality or ability to concentrate, and problems with memory. A brain tumor is classified as primary or secondary depending on its site of origin.

Benign: Benign tumors are non-cancerous masses of cells that grow slowly in the brain. They usually stay in one place and do not spread. Most benign brain tumors are detected by CT and MRI scans.

Malignant: A malignant brain tumor is a rapidly growing cancer that spreads to other areas of the brain and spine. Most malignant brain tumors are secondary [3] but they can be primary too. These tumors are life threatening.

Keywords: Lung cancer, MATLAB, CT images, Distortion removal, Segmentation

Image processing methods: Image processing is used to analyse images at the lowest level, whatever their quality. These operations do not increase the information content of the image, but they decrease it if entropy is taken as the information measure. The key requirement of processing is to improve pixel intensity by converting the image to digital form, dividing it into pixels, carrying out mathematical operations on the pixels, and reconstructing the image with improved quality [11]. Pre-processing of CT images is the initial step in image analysis, followed by the segmentation process; finally, morphological operations are applied to detect the cancer spots/cells in the image. The result can also be used to determine the extent of spread of the cancer, i.e. what percentage of the lung is affected. The morphological operations are applied by comparing the size and shape of the cancer cell with a normal cell, and the diseased cells are then displayed on the grayscale image at maximum intensity (255).
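The "percentage of lung affected" mentioned above can be illustrated with a small sketch in plain Python. It assumes two binary masks of the same size are already available, one for the organ and one for the detected lesion; the function name `affected_percentage` and the example masks are hypothetical and are not taken from the paper.

```python
def affected_percentage(organ_mask, lesion_mask):
    """Return the share of organ pixels (value 1) that are also lesion pixels."""
    organ_pixels = 0
    lesion_pixels = 0
    for organ_row, lesion_row in zip(organ_mask, lesion_mask):
        for organ, lesion in zip(organ_row, lesion_row):
            if organ:
                organ_pixels += 1
                # Only count lesion pixels that lie inside the organ.
                if lesion:
                    lesion_pixels += 1
    if organ_pixels == 0:
        raise ValueError("organ mask is empty")
    return 100.0 * lesion_pixels / organ_pixels


# Hypothetical masks: 8 organ pixels, 2 of them marked as lesion.
organ_mask = [[0, 1, 1, 0],
              [1, 1, 1, 1],
              [0, 1, 1, 0]]
lesion_mask = [[0, 0, 1, 0],
               [0, 0, 1, 0],
               [0, 0, 0, 0]]
percentage = affected_percentage(organ_mask, lesion_mask)
```

The value is a pixel-count ratio on a single slice; a volumetric estimate would need all slices and the pixel spacing, which the paper does not discuss.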

2 Methodology & Algorithm:

The proposed algorithm is given in Figure 1:

Figure 1: Algorithm for Cancer

2.1 Input image:

For any category of cancer, an image of the internal parts of the body must first be obtained. A CT scan, also known as X-ray computed tomography, uses X-rays to capture images from various angles and merges these images to generate cross-sectional tomographic images of particular areas of the scanned tissue, i.e. it allows the inside of the body to be seen without invasive techniques [12]. The lungs are the primary organs of respiration in humans as well as other animals. In mammals and most other vertebrates, two lungs are located on either side of the heart, near the backbone. Their function is to take oxygen from the atmosphere and transfer it into the bloodstream, and to release carbon dioxide from the bloodstream into the atmosphere. Humans have two lungs, right and left, situated inside the thoracic cavity of the chest. The right lung is bigger than the left, which shares space in the chest with the heart. Together the lungs weigh approximately 1.4 kg. The pleural sac in which the lungs are enclosed allows the inner and outer walls to slide over each other without much friction. This sac encloses each lung and also divides each lung into sections called lobes. The right lung has three lobes and the left has two. The lobes are further divided into bronchopulmonary segments and lobules. The lungs have a unique blood supply, receiving deoxygenated blood from the heart in order to take up oxygen (the pulmonary circulation) and a separate supply of oxygenated blood (the bronchial circulation). The tissue of the lungs can be affected by a number of diseases, including pneumonia and lung cancer. Chronic diseases such as chronic obstructive pulmonary disease and emphysema (damage to the alveoli in the lungs) can be related to smoking or exposure to harmful substances. Diseases such as bronchitis can also affect the respiratory tract.
Images of affected and normal tissue are quite different and easily distinguishable. These CT images are converted to grayscale images.

2.2 Grayscale image:

In computing, a grayscale image is a digital image in which the value of each pixel is a single sample, i.e. it carries only intensity (amplitude) information. Images of this kind, also known as black-and-white images, consist exclusively of shades of gray, from black (lowest intensity) to white (highest intensity) [13]. Grayscale images result from measuring the intensity of light at each pixel in a single band of the light spectrum. They can also be obtained from a full-colour image. Grayscale is chosen because even small differences in pixel intensity are helpful in detecting changes in the cells. A gray colour is one in which the R, G and B planes have equal amplitude, with the brightness level represented as a decimal number from 0 to 255. For every pixel in an RGB grayscale image, R = G = B. The intensity varies in proportion to the number representing the brightness levels of the RGB colours. Black is represented by R = G = B = 0 and white by R = G = B = 255.
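A minimal sketch of obtaining a grayscale image from a full-colour one, written in plain Python lists so that it runs without extra libraries. The luma weights 0.299, 0.587 and 0.114 (ITU-R BT.601) are an assumption, since the paper does not state which weighting was used, and the function name `to_grayscale` is illustrative.

```python
def to_grayscale(rgb_image):
    """Convert rows of (R, G, B) tuples (0-255) to rows of gray levels (0-255)."""
    gray = []
    for row in rgb_image:
        gray_row = []
        for r, g, b in row:
            # Weighted sum of the three channels (BT.601 weights, assumed).
            level = 0.299 * r + 0.587 * g + 0.114 * b
            gray_row.append(min(255, max(0, int(round(level)))))
        gray.append(gray_row)
    return gray


image = [[(0, 0, 0), (255, 255, 255)],
         [(128, 128, 128), (255, 0, 0)]]
gray = to_grayscale(image)
```

Pixels that already satisfy R = G = B keep their value, which matches the statement that black maps to 0 and white to 255.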

2.3 High pass filter:

As the name suggests, a high-pass filter passes frequencies above a certain cut-off frequency and attenuates frequencies below it. It is mainly used for sharpening images: contrast is enhanced between adjacent areas that differ in brightness. A high-pass filter sets a high cut-off threshold to retain the detail of an image while cutting the low-frequency data. The high-pass kernel is designed to increase the amplitude of the centre pixel relative to the adjacent pixels. The kernel array generally contains a single positive value at its centre, completely surrounded by negative values.
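The kernel idea can be sketched as follows. The 3x3 kernel with 8 at the centre and -1 around it is a common textbook choice and is an assumption here; the figures in this paper refer to a Gaussian high-pass filter whose parameters are not given. Border pixels are replicated, and the response is clipped to 0-255; both are illustrative choices.

```python
# Assumed 3x3 high-pass kernel: positive centre, negative surround, sum zero.
HIGH_PASS_KERNEL = [[-1, -1, -1],
                    [-1,  8, -1],
                    [-1, -1, -1]]


def apply_kernel(image, kernel):
    """Apply a 3x3 kernel to a grayscale image given as a list of rows."""
    height, width = len(image), len(image[0])
    out = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            total = 0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    # Replicate the nearest border pixel outside the image.
                    yy = min(max(y + dy, 0), height - 1)
                    xx = min(max(x + dx, 0), width - 1)
                    total += kernel[dy + 1][dx + 1] * image[yy][xx]
            # Clip the response to the valid gray-level range.
            out[y][x] = min(255, max(0, total))
    return out


flat = [[100] * 5 for _ in range(5)]
spot = [[10, 10, 10],
        [10, 50, 10],
        [10, 10, 10]]
flat_response = apply_kernel(flat, HIGH_PASS_KERNEL)
spot_response = apply_kernel(spot, HIGH_PASS_KERNEL)
```

A uniform region produces a zero response because the kernel sums to zero, while the isolated bright pixel in `spot` is strongly amplified.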

2.4 Median filtering:

The median filter is a nonlinear digital filter used to remove noise from the image. To detect edges in the image, noise should first be removed up to some threshold, and then edge detection is performed. Hence the median filter is placed before the edge detector. Its main feature is that it removes noise without removing edges. The median filter is similar to an averaging filter, in which each output pixel is set to the average value of the neighbouring pixels of the input image; the median filter instead uses the median of the neighbourhood. The median is much less sensitive than the mean to extreme pixel values, which helps in noise reduction.
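A minimal sketch of a 3x3 median filter in plain Python. The window size and the replicated border are assumptions; the paper does not state which neighbourhood was used.

```python
def median_filter3x3(image):
    """Replace each pixel with the median of its 3x3 neighbourhood."""
    height, width = len(image), len(image[0])
    out = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            window = []
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    # Replicate the nearest border pixel outside the image.
                    yy = min(max(y + dy, 0), height - 1)
                    xx = min(max(x + dx, 0), width - 1)
                    window.append(image[yy][xx])
            window.sort()
            out[y][x] = window[4]  # middle of the 9 sorted values
    return out


# A vertical edge (10 | 200) with one "salt" pixel (255) and one "pepper" pixel (0).
noisy = [[10, 10, 10, 200, 200],
         [10, 255, 10, 200, 200],
         [10, 10, 10, 200, 0],
         [10, 10, 10, 200, 200]]
filtered = median_filter3x3(noisy)
```

In this example both outliers are removed while the edge between the third and fourth columns stays exactly where it was, which is the behaviour the text attributes to the median filter.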

2.5 Threshold segmentation:

Threshold segmentation is one of the simplest segmentation methods. The pixels are divided according to their intensity levels. There are various types of segmentation depending on the criterion used: threshold-based, edge-based, region-based, clustering, etc. Thresholding maps a grayscale image to a binary image. After this operation, the image contains only two pixel values, 0 and 1. If an image contains a dark structure on a bright background, thresholding can be used to separate the structure. To choose a particular threshold value, many sub-algorithms can be used, e.g. histogram estimation, optimal thresholding, iterative thresholding and K-means clustering. In K-means clustering, the grayscale image is divided into K segments, i.e. K-1 threshold values, so as to reduce the variance within each segment. Many images are made of pixels that contain more than one value, e.g. RGB. If these pixel values are separated into R, G and B, they are called channels.
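Of the threshold-selection methods listed above, iterative thresholding is the simplest to sketch. The version below starts from the mean intensity and repeatedly moves the threshold to the midpoint of the two class means; the stopping tolerance, the iteration cap and the function names are assumptions for illustration, not values from the paper.

```python
def iterative_threshold(image, tolerance=0.5, max_iterations=100):
    """Estimate a global threshold by alternating class means and midpoints."""
    pixels = [p for row in image for p in row]
    threshold = sum(pixels) / len(pixels)  # start from the mean gray level
    for _ in range(max_iterations):
        low = [p for p in pixels if p <= threshold]
        high = [p for p in pixels if p > threshold]
        if not low or not high:
            break  # all pixels fell on one side; keep the current value
        new_threshold = (sum(low) / len(low) + sum(high) / len(high)) / 2
        converged = abs(new_threshold - threshold) < tolerance
        threshold = new_threshold
        if converged:
            break
    return threshold


def binarize(image, threshold):
    """Map a grayscale image to a binary image with values 0 and 1."""
    return [[1 if p > threshold else 0 for p in row] for row in image]


image = [[20, 30, 200],
         [25, 210, 220],
         [30, 35, 215]]
threshold = iterative_threshold(image)
mask = binarize(image, threshold)
```

For an image with a dark structure on a bright background, the same mask can simply be inverted.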

2.6 Watershed algorithm:

The algorithm can be explained using a practical analogy. Consider a surface immersed in a lake, with a hole at its lowest point, so that water starts filling in through that hole and keeps rising. If two such basins lie very close to each other, a point is reached where the water from both would overlap and mix. At exactly those points, dams are built so that the water does not mix. These dams are the watershed lines, and through this flooding process the surface is separated into regions. There are many ways to carry out this algorithm. One of the most common watershed algorithms was introduced by F. Meyer and is called the 'Meyer watershed algorithm'. This algorithm applies only to grayscale images [14].
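The flooding analogy can be sketched as a marker-based, priority-queue flood in the style of Meyer's algorithm. This is a simplified illustration, not the implementation used in the paper: the marker positions, the 4-connected neighbourhood and the label -1 for dam pixels are all assumptions.

```python
import heapq

WATERSHED = -1  # label given to dam (watershed line) pixels


def neighbours(y, x, height, width):
    """Yield the 4-connected neighbours that lie inside the image."""
    for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        yy, xx = y + dy, x + dx
        if 0 <= yy < height and 0 <= xx < width:
            yield yy, xx


def meyer_watershed(gray, markers):
    """Flood a grayscale image from labelled markers (0 means unlabelled)."""
    height, width = len(gray), len(gray[0])
    labels = [row[:] for row in markers]
    queued = [[label != 0 for label in row] for row in markers]
    heap = []
    counter = 0  # tie-breaker: first queued, first flooded at equal gray level

    def push_unqueued_neighbours(y, x):
        nonlocal counter
        for yy, xx in neighbours(y, x, height, width):
            if not queued[yy][xx]:
                queued[yy][xx] = True
                heapq.heappush(heap, (gray[yy][xx], counter, yy, xx))
                counter += 1

    for y in range(height):
        for x in range(width):
            if markers[y][x] > 0:
                push_unqueued_neighbours(y, x)

    while heap:
        _, _, y, x = heapq.heappop(heap)  # lowest gray level floods first
        seen = {labels[yy][xx] for yy, xx in neighbours(y, x, height, width)
                if labels[yy][xx] > 0}
        if len(seen) == 1:
            labels[y][x] = seen.pop()      # same basin: water keeps rising
            push_unqueued_neighbours(y, x)
        else:
            labels[y][x] = WATERSHED       # two basins meet: build a dam
    return labels


# Two basins (low values) separated by a ridge (value 9) in the middle column.
gray = [[1, 2, 5, 9, 5, 2, 1] for _ in range(3)]
markers = [[0] * 7 for _ in range(3)]
markers[1][0] = 1  # hypothetical marker in the left basin
markers[1][6] = 2  # hypothetical marker in the right basin
labels = meyer_watershed(gray, markers)
```

In this example the ridge column is left as the dam, and the pixels on either side take the label of the marker whose basin reached them first.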

2.7 Morphological operations:

Mathematical morphology is a technique for analysing segmented structures/images based on set theory, random functions, etc. It is applied to digital images. For example, binary morphological operations probe an image with a simple, predefined shape (a structuring element) and draw conclusions from how this shape fits into the image or which parts of the image it misses.
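The "fit" and "miss" idea can be sketched with binary erosion and dilation using a 3x3 square structuring element. The choice of structuring element, the treatment of pixels outside the image as 0, and the use of an opening to remove small specks are assumptions for illustration; the paper does not specify which operations were applied.

```python
OFFSETS = [(dy, dx) for dy in (-1, 0, 1) for dx in (-1, 0, 1)]


def erode(mask):
    """Keep a pixel only where the whole 3x3 square fits inside the object."""
    height, width = len(mask), len(mask[0])
    out = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            fits = all(
                0 <= y + dy < height and 0 <= x + dx < width
                and mask[y + dy][x + dx] == 1
                for dy, dx in OFFSETS)
            out[y][x] = 1 if fits else 0
    return out


def dilate(mask):
    """Set a pixel wherever the 3x3 square touches at least one object pixel."""
    height, width = len(mask), len(mask[0])
    out = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            hits = any(
                0 <= y + dy < height and 0 <= x + dx < width
                and mask[y + dy][x + dx] == 1
                for dy, dx in OFFSETS)
            out[y][x] = 1 if hits else 0
    return out


def opening(mask):
    """Erosion followed by dilation: removes objects smaller than the square."""
    return dilate(erode(mask))


# A 3x3 object plus one isolated speck at row 2, column 5.
mask = [[0, 0, 0, 0, 0, 0],
        [0, 1, 1, 1, 0, 0],
        [0, 1, 1, 1, 0, 1],
        [0, 1, 1, 1, 0, 0],
        [0, 0, 0, 0, 0, 0]]
opened = opening(mask)
```

The opening removes the isolated speck and leaves the 3x3 object unchanged, which is one way small segmentation artefacts can be cleaned up before the final output image is produced.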


MRI/CT images of the brain are processed for tumor recognition using MATLAB. The block diagram in Figure 2 shows the overall processing technique, which comprises pre-processing, segmentation and morphological operations.

Figure 2: Block Diagram

Pre-processing improves the image data by suppressing undesired distortions and enhancing image features that are helpful in further processing. The goal of pre-processing is to remove noise and provide contrast enhancement [5] to improve the image quality. Figure 2 shows the block diagram for pre-processing. The functions performed in pre-processing are grayscale conversion, noise removal and contrast enhancement.

3.1 Processing grayscale image with median filter:

The proposed workflow is given in Figure 3.

Figure 3: Algorithm for preprocessing of CT scan image

The CT image is converted to a grayscale image so that mathematical operations can be performed on it, as shown in Figure 4.1 and Figure 4.2.

Figure 4.1: Grayscale Image for lungs
Figure 4.2: Grayscale Image for brain

Now that the image is ready for further mathematical operations, it is passed through a high-pass filter to enhance the required information, as shown in Figure 5.1 and Figure 5.2.

Figure 5.1: After applying Gaussian HP filter for Lungs
Figure 5.2: After applying Gaussian HP filter for brain

Final pre-processing is done by passing the image through a median filter, which removes salt-and-pepper noise from the image while preserving edges, as shown in Figure 6.1 and Figure 6.2.

Figure 6.1: Median filtered for lungs
Figure 6.2: Median filtered for brain

The output of the watershed step contains pixels of diseased lung and brain tissue that are not labelled. These pixels form the watershed lines. Figure 7.1 shows the output image after the morphological operations. This figure clearly shows that the left lung is more affected by cancer than the right lung in the given CT scan image, with the cancerous region highlighted against the background; Figure 7.2 shows the corresponding result for the brain.

Figure 7.1: Output image after processing for lungs
Figure 7.2: Output image after processing for brain

4 Conclusion and future work:

The above method proceeds in two steps: 1) processing of the noisy input image with filtering and segmentation, and 2) morphological operations on the CT image. The cancer-affected lung and brain regions can be detected in the final output image for a given CT input image. The proposed method can also be applied to other cancer types such as breast cancer and skin cancer. It also finds application in medical research.

5 References:

  1. Lipowsky R, Sackmann E. Preface to volume 1a from cells to vesicles: introduction and overview. Handbook of Biological Physics. 1995.
  2. Lemjabbar-Alaoui H, Hassan OU, Yang YW, Buchanan P. Lung cancer: Biology and treatment options. BBA Reviews on Cancer. 2015; 1856(2):189-210.
  3. Pierige F, Serafini S, Rossi L, Magnani M. Cell-based drug delivery. Advanced Drug Delivery Reviews. 2008; 60(2):286-95.
  4. Villa CH, Anselmo AC, Mitragotri S, Muzykantov V. Red blood cells: supercarriers for drugs, biologicals, and nanoparticles and inspiration for advanced delivery systems. Advanced Drug Delivery Reviews. 2016
  5. Dubey AK, Gupta U, Jain S. Epidemiology of lung cancer and approaches for its prediction: a systematic review and analysis. Chinese Journal of Cancer. 2016; 35(1):71.
  6. Scagliotti GV, De Marinis F, Rinaldi M, Crino L, Gridelli C, Ricci S, Matano E, Boni C, Marangolo M, Failla G, Altavilla G. Phase III randomized trial comparing three platinum-based doublets in advanced non–small-cell lung cancer. Journal of Clinical Oncology. 2002; 20(21):4285-91.
  7. Non-Small Cell Lung Cancer, Available at: pdf/non-small-cell.pdf, Adapted from National Cancer Institute (NCI) and Patients Living with Cancer (PLWC), 2007, (accessed July 2011).
  8. Tarawneh M., Nimri O., Arqoub K., Zaghal M., Cancer Incidence in Jordan 2008, Available at: 0Registry_2008%20Report_1.pdf, 2008, (accessed July 2011).
  9. Lung Cancer Database, Available at:, (accessed July 2011).
  10. Gonzalez R.C., Woods R.E., Digital Image Processing, Upper Saddle River, NJ Prentice Hall, 2008.
  11. Cristobal G., Navarro R., Space and frequency variant image enhancement based on Gabor representation, Pattern Recognition Letters, Elsevier, 1994, 15, p. 273-277.
  12. Krishan A., Evaluation of Gabor filter parameters for image enhancement and segmentation, in Electronic Instrumentation and Control Engineering, Master. Punjab: Thapar University, 2009, p. 126.
  13. Nunes É.D.O., Pérez M.G., Medical Image Segmentation by Multilevel Thresholding Based on Histogram Difference, presented at 17th International Conference on Systems, Signals and Image Processing, 2010.
  14. Venkateshwarlu K., Image Enhancement using Fuzzy Inference System, in Computer Science & Engineering, Master thesis, 2010.
