[WEB] robots.txt

Security Study

PreGoogler 2019. 8. 1. 23:26

Many of the low-scoring web hacking challenges are solved by using the 'robots.txt' file, so in this post I'll summarize what robots.txt is and how it can be used.

To find out what the robots.txt file is, I first ran a search, which turned up the following:

Robots.txt is a file that contains paths that bots, most often search-engine bots such as Googlebot, should not crawl. It tells search engines that a directory is private and should not be crawled by them.

If you are a site owner and want to make a robots.txt file, go to the following link; it will create a robots.txt file for you.

http://www.mcanerin.com/EN/search-engine/robots-txt.asp

So, for now, robots.txt is pretty much what websites use to block certain pages from search engines.

Here is a sample: http://www.whitehouse.gov/robots.txt

That is what the search returned.

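In short, robots.txt lists the paths that web crawlers (Googlebot and other search-engine bots) are asked not to crawl, so it exists to keep particular pages out of search-engine results. Because the file itself is publicly readable, it also tells anyone who opens it which paths the site owner wanted to keep out of sight.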
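To make the crawler's side of this concrete, here is a minimal sketch using Python's standard `urllib.robotparser` module. The robots.txt contents, the `/admin/` and `/private/` paths, and the `ExampleBot` user agent are all made up for illustration and do not come from any real site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents, invented for this example.
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /admin/
Disallow: /private/
"""


def build_parser(robots_text: str) -> RobotFileParser:
    """Parse robots.txt text that is already in memory (no network access)."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    return parser


parser = build_parser(SAMPLE_ROBOTS_TXT)

# A well-behaved crawler asks before fetching each URL.
# "ExampleBot" is a hypothetical user agent covered by the "*" rule.
print(parser.can_fetch("ExampleBot", "/index.html"))   # True
print(parser.can_fetch("ExampleBot", "/admin/login"))  # False
```

Note that the check happens entirely on the crawler's side: nothing in this sketch stops a client from requesting `/admin/login` anyway, which is why the file is useful in hacking challenges.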

One hacking challenge that uses this file is at
http://natas3.natas.labs.overthewire.org/

Log in there with id: natas3, pw: sJIJNW6ucpu6HPZ1ZAchaDtwd7oGrD14,

then append robots.txt to the end of the URI and press Enter to see the contents of the robots.txt file.

The file shows that the URI excluded from crawling on this site is http://natas3.natas.labs.overthewire.org/s3cr3t/, so visiting that path reveals the password for the next level.
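The same steps can be scripted instead of done in the browser. The following is a rough sketch, assuming the site uses HTTP Basic authentication; the function names `extract_disallowed` and `fetch_robots` are my own, and the sample file contents are an assumption based on the `/s3cr3t/` path found above, not a verbatim copy of the server's file.

```python
import base64
import urllib.request


def extract_disallowed(robots_text: str) -> list[str]:
    """Return every non-empty Disallow path found in robots.txt text."""
    paths = []
    for raw_line in robots_text.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = raw_line.split("#", 1)[0].strip()
        field, sep, value = line.partition(":")
        if sep and field.strip().lower() == "disallow" and value.strip():
            paths.append(value.strip())
    return paths


def fetch_robots(base_url: str, username: str, password: str) -> str:
    """Download <base_url>/robots.txt, sending HTTP Basic credentials."""
    token = base64.b64encode(f"{username}:{password}".encode()).decode()
    request = urllib.request.Request(
        base_url.rstrip("/") + "/robots.txt",
        headers={"Authorization": f"Basic {token}"},
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.read().decode("utf-8", errors="replace")


# Assumed file contents, used here so the example runs without a network.
SAMPLE = "User-agent: *\nDisallow: /s3cr3t/\n"
print(extract_disallowed(SAMPLE))  # ['/s3cr3t/']
```

Against the live challenge, one would call `fetch_robots("http://natas3.natas.labs.overthewire.org/", "natas3", "sJIJNW6ucpu6HPZ1ZAchaDtwd7oGrD14")` and pass the result to `extract_disallowed` to list the paths worth visiting.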